This article presents a case study of Kinolab, a digital platform for the analysis of narrative film language. It describes the need for a scholarly database of clips focusing on film language for cinema and media studies faculty and students, highlighting recent technological and legal advances that have created a favorable environment for this kind of digital humanities work. The project is situated within the broader context of contemporary developments in moving image annotation and the unique challenges posed by computationally driven moving image analysis. The article also argues for a universally accepted data model for film language to facilitate the academic crowdsourcing of film clips and the sharing of research and resources across the Semantic Web.
One illustration of this predicament is the ongoing lack of a database dedicated to something as seemingly straightforward as the analysis of film language. As Lucy Fischer and Patrice Petro lamented in their introduction to the 2012 MLA anthology Teaching Film, "the scholar of literature can do a keyword search for all the occasions that William Shakespeare or Johann Goethe has used a particular word, [but] no such database exists for the long shot in Orson Welles or the tracking shot in Max Ophüls". In response to the improvements in moving image access described above, the authors of this case study set out to develop Kinolab, an academically crowdsourced platform for the digital analysis of film language in narrative film and media (see ). This case study describes the opportunities and challenges that we have encountered in our efforts to create, manage, and share a digital repository of annotated film and series clips broadly and deeply representative of film language as it has evolved over time and across countries and genres. In this essay, we situate Kinolab among related projects, recent efforts to incorporate machine learning into DH methodologies for text and moving image analysis, and ongoing efforts by AVinDH practitioners to assert the right to make fair use of copyrighted materials in their work.
Even as machine learning projects like the MEP and Distant Viewing Lab bring scholars of moving images closer to the kind of distant reading now being performed on digitized literary texts, their creators acknowledge an ongoing need for human interpreters to bridge the semantic gap created when machines attempt to interpret images meaningfully. Researchers can extract and analyze semantic information such as lighting or shot breaks from visual materials only after they have established and encoded an interpretive framework: this work enables computers to close the gap between the pixels on screen and what they have been told those pixels represent. The digital analysis of film language generates an especially wide semantic gap insofar as it often requires the identification of semiotic images of a higher order than a shot break, for example the non-diegetic insert (an insert that depicts an action, object, or title originating outside of the space and time of the narrative world). For this reason, analysis in Kinolab for now takes place primarily through film language annotations assigned to clips by project curators rather than through processes driven by machine learning, such as object recognition.
Kinolab is structured to help researchers reduce the semantic gap in digital film language analysis in three distinct ways. The most basic is a collaborative platform for the consistent identification of semiotic units of film language in film clips, which makes those units immediately available for sophisticated searches. The Kinolab software architecture is also designed to integrate distant viewing plugins so that some forms of film language can be recognized automatically by machine learning algorithms developed in the scientific community; such plugins would also allow subsequent exploratory data analysis of Kinolab's archive. Finally, Kinolab can serve as a resource for applying, validating, and enhancing new distant viewing techniques, which can draw on the database's film language information to develop training datasets and improve their results. Given Kinolab's architecture, it can produce a standard machine-readable output that pairs a given clip URL with a set of associated tags, which a machine learning algorithm could ingest as training data to learn examples of higher-level semantic annotations, such as the close-up shot. What Kinolab currently lacks toward this goal is timestamp data specifying when a given form of film language actually occurs within a clip (start/stop); combined with automatically extracted low-level sign recognition (e.g. objects, faces, lighting), such data would be extremely valuable for machine learning processes. The existing architecture could be expanded to support this by extending the clip-tag relationship to include duration information; the larger task, however, would be identifying and entering that information into the system. One possible way to address this limitation is to integrate a tool like the aforementioned Media Ecology Project's Semantic Annotation Tool (SAT) into Kinolab. The SAT could facilitate the creation of more finely grained annotations that bridge the gap between full clips and their respective tags, providing a more refined training dataset.
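To make the preceding concrete, the following is a minimal sketch of the kind of machine-readable training record described above, extended with the hypothetical start/stop timestamps that Kinolab does not yet capture. All class, field, and function names here are illustrative assumptions rather than part of Kinolab's actual architecture or API.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional

@dataclass
class TagAnnotation:
    """A film language tag applied to a clip, optionally with start/stop timestamps (in seconds)."""
    label: str                      # e.g. "close-up" or "non-diegetic insert"
    start: Optional[float] = None   # hypothetical duration data not yet captured by Kinolab
    end: Optional[float] = None

@dataclass
class ClipRecord:
    """A training record pairing a clip URL with its film language annotations."""
    clip_url: str
    annotations: List[TagAnnotation] = field(default_factory=list)

def export_training_record(record: ClipRecord) -> str:
    """Serialize a clip record as JSON that a machine learning pipeline could ingest."""
    return json.dumps(asdict(record), indent=2)

# Example: a clip tagged as a close-up, with hypothetical timestamp data
record = ClipRecord(
    clip_url="https://kinolab.example.edu/clips/1234",   # placeholder URL
    annotations=[TagAnnotation(label="close-up", start=12.5, end=18.0)],
)
print(export_training_record(record))
```

Because the timestamp fields are optional, the same record format could be emitted today from the existing clip-tag relationship and enriched later as duration information is added.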
Kinolab is a digital platform for the analysis of narrative film language, yet, as previous discussion has suggested, 'film language' is a fluid concept that requires defining in relation to the project's objectives. The conceptualization of film as a language with its own set of governing rules or codes has a rich history that dates back to the origins of the medium itself. This includes contributions from key figures like D.W. Griffith, Sergei Eisenstein, André Bazin, and Christian Metz, among many others. Broadly speaking, film language serves as the foundation of film form, style, and genre. Kinolab focuses on narrative film, commonly understood as "any film that tells a story, especially those which emphasize the story line and are dramatic". To tell a story cinematically, film language necessarily differs in key ways from languages employed for storytelling in other mediums. As the example drawn from The Silence of the Lambs demonstrates, this is particularly evident in its treatment of modalities of time (for example, plot duration, story duration, and viewing time) and space (for example, setting up filmic spaces through framing, editing, and point of view). Film language can also be understood as the basis for, or product of, techniques of the film medium such as mise-en-scène, cinematography, editing, and sound that, when used meaningfully, create distinctive examples of film style such as classical Hollywood cinema or Italian neorealism. Finally, film language is a constitutive aspect of genre when the latter is being defined according to textual features arising out of film form or style: that is, an element of film language such as the jump cut, an abrupt or discontinuous edit between two shots that disrupts the verisimilitude produced by traditional continuity editing, can be understood as a characteristic expression in horror films, which make effective use of its jarring effects. Kinolab adopts a broad view of film language that includes technical practices as well as aspects of film history and theory as long as these are represented in, and can therefore be linked to, narrative media clips in the collection.
The vast majority of Kinolab's storage is devoted to audiovisual clips. Accordingly, we built the first implementation of Kinolab on a system that could handle most of the media file management for us. Our priority was finding an established content management system that could handle the intricacies of uploading, organizing, annotating, and maintaining digital clips. To meet this goal, we initially adopted Omeka, a widely used and well-respected platform with a proven record for making digital assets available online via an easy-to-use interface (see ). Built to meet the needs of museums, libraries, and archives seeking to publish digital collections and exhibitions online, Omeka's features made it the most appealing out-of-the-box solution for our first release of Kinolab. These features included: an architecture stipulating that Items belong to Collections, a relationship analogous to clips belonging to films; almost limitless metadata functionality, facilitating deep descriptive applications for film clips; a tagging system that made applying film language identifiers simple and straightforward; a sophisticated search interface capable of performing complex searches; and, finally, a built-in administrative backend capable of handling a significant part of the project's file and database management tasks behind the scenes.
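As an illustration of how this Item/Collection architecture maps onto Kinolab's material, the following is a minimal sketch in which a Film stands in for an Omeka Collection and a Clip for an Omeka Item. The class and field names are illustrative assumptions and do not reproduce Omeka's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Clip:
    """Analogous to an Omeka Item: a single annotated excerpt."""
    title: str
    media_file: str                                    # path or URL of the uploaded clip
    metadata: Dict[str, str] = field(default_factory=dict)  # descriptive metadata fields
    tags: List[str] = field(default_factory=list)      # film language identifiers

@dataclass
class Film:
    """Analogous to an Omeka Collection: the film or series a clip is excerpted from."""
    title: str
    year: int
    clips: List[Clip] = field(default_factory=list)

# Example: a clip belonging to a film, mirroring Items belonging to Collections
film = Film(title="The Silence of the Lambs", year=1991)
film.clips.append(
    Clip(
        title="Illustrative excerpt",                  # hypothetical clip title
        media_file="clips/silence_excerpt.mp4",        # placeholder path
        tags=["close-up", "point of view"],
    )
)
```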
Unlike IMDb, TMDb has a clear commitment to open access and excellent documentation. In testing, it offered as much and sometimes more information than one could access on IMDb. We have concerns about the long-term reliability of a less established source like TMDb compared to a recognized entity such as IMDb, but since we make only tangential use of this data, we decided that it is provisionally the best option. The metadata that TMDb provides is important for locating and contextualizing Kinolab clips, but the project is not attempting to become a definitive source of information about the films and series from which those clips are excerpted. Consequently, we simply reference this kind of metadata via TMDb's APIs or direct Kinolab users to the TMDb site itself. The lack of an accessible, authoritative scholarly database dedicated to narrative films and series is an ongoing problem shared by the entire field of media studies. In the case of the Kinolab project, it has represented a challenge almost as significant as the legal and technological ones outlined elsewhere in this case study.
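What follows is a minimal sketch of this kind of tangential metadata lookup against TMDb's public v3 search endpoint. The lookup_film_metadata helper, the placeholder API key, and the choice of fields printed are illustrative assumptions rather than a description of Kinolab's actual integration code.

```python
import requests
from typing import Optional

TMDB_API_KEY = "YOUR_TMDB_API_KEY"  # placeholder; a real TMDb API key is required
TMDB_SEARCH_URL = "https://api.themoviedb.org/3/search/movie"

def lookup_film_metadata(title: str, year: Optional[int] = None) -> Optional[dict]:
    """Query TMDb for basic contextual metadata about a film; return the top match or None."""
    params = {"api_key": TMDB_API_KEY, "query": title}
    if year is not None:
        params["year"] = year
    response = requests.get(TMDB_SEARCH_URL, params=params, timeout=10)
    response.raise_for_status()
    results = response.json().get("results", [])
    return results[0] if results else None

# Example: contextual metadata for a film represented in Kinolab
match = lookup_film_metadata("The Silence of the Lambs", year=1991)
if match:
    print(match["title"], match["release_date"])
```

Because the lookup happens at reference time rather than being stored as authoritative data, Kinolab remains insulated from changes in the upstream source while still giving users a path to fuller contextual information.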