The Anatomy of Sound and Image in Contemporary Media: A Conceptual Journey with Arzu Karaduman in the Footsteps of Chion

Interview Series: Beyond Synchrony: Dialogues on New Media and Sensory Aesthetics

1. Within the framework of your academic trajectory and theoretical orientation, what were the main intellectual or aesthetic motivations that led you to focus on moments in which synchronization breaks down? Could you explain how this interest emerged and what kind of shift it created in your research path?

My focus on asynchrony emerged from a moment of analytical failure. One evening at Georgia Tech, I was watching Nuri Bilge Ceylan’s Once Upon a Time in Anatolia with friends from the Turkish Student Organization who organized the screening. Midway through a dialogue scene, something happened that stopped me cold: the characters’ voices continued, but their lips no longer moved. I remember physically turning to scan the faces of my friends sitting next to me, curious to see whether anyone else was as startled as I was. The shock of realizing that I had witnessed a genuinely new technique of cinematic audiovisual asynchrony compelled me to consult my cohort as well as my professors in the Moving Image Studies program at Georgia State University. I knew this new technique was not an instance of internal monologue, not acousmatic voice, not a voice-over, not an ellipsis; none of the established categories in film sound theory applied. Out of my fascination with this technique emerged the concept of the “cryptic voice,” a voice that is simultaneously present and absent, uttered and withheld, audible yet refusing to align with the moving lips of its speaker. One of my most exciting publications is the forthcoming chapter in the Oxford Handbook of Media and Vocality, because it will introduce this foundational term more fully with an extended analysis of the dialogue scene in Once Upon a Time in Anatolia as well as another scene in Ceylan’s Three Monkeys.

The cryptic voice became the conceptual spark that redirected my research toward identifying and naming the new sound-image relations as they emerge in contemporary cinema. This shift eventually led to my broader methodological framework, anasonicity, which examines what I describe as spectral, barely audible, or structurally “unsyncable” sounds in contemporary global cinemas. My project “Sounding Anew: Anasonicity in Contemporary Global Cinemas” revisits existing film sound terminology and proposes “anasonicity” as a new methodological approach designed to address emerging sound techniques that transform conditions of audibility and inaudibility in contemporary cinematic experiences. Taken together, these sounds radically disrupt synchronization and require new modes of listening, while the films that deploy them unsettle linear temporality by rendering the sounds of past, present, and future indistinguishable within their narrative worlds.

I call “Sounding Anew” the sonic counterpart of Akira Lippit’s Atomic Light (Shadow Optics). The conceptual seed for anasonicity (or asonority, as I use the terms interchangeably) was planted in Lippit’s formulation of avisuality, his term for the paradox of what is visual yet invisible, an impossible type of visuality that emerges with the birth of cinema, the X-ray, and psychoanalysis in 1895. Lippit’s insight is that by the late twentieth century, the image itself had begun to exceed the limits of visibility. Anasonicity takes up that provocation on the terrain of sound. If avisuality charts the limits of seeing, anasonicity attends to a parallel shift in our experience of hearing that happens a hundred years later: sounds that slip between the audible and the inaudible, voices that fall out of synchronization in completely new ways, sounds that refuse to anchor themselves in time. Attending to the contemporary anasonic nature of cinema, then, I name the emerging sonic techniques that trouble what we think sound is supposed to do in cinema and that, by doing so, ask us to attend critically to moments that demand a new ear and a new thinking.

2. Your work appears to resonate with Michel Chion’s approach to the sound–image relationship. How has Chion’s theoretical framework shaped your scholarly orientation, and in what ways do you expand, reinterpret, or challenge the conceptual space he opened?

Michel Chion remains foundational for thinking about cinematic sound: his attention to the phenomenology of listening created the conceptual template many of us have inherited. While serious scholarly engagement with sound and sound–image relations began in earnest with the 1980 Cinema/Sound special issue of Yale French Studies under Rick Altman’s editorship, it was Chion’s Audio-Vision that became truly indispensable to the evolution of film sound studies. Since the 1980s, the field has expanded and transformed, but Chion’s framework endures as one of its most generative intellectual anchors.

I was particularly impressed by Chion’s capacity to generate incisive terminology in Audio-Vision, especially his formulation of the acousmêtre, which offered a model for how conceptual precision can illuminate phenomena that had long remained elusive. Among all the formal elements of cinema, sound is notoriously difficult to analyze, and Chion’s work demonstrates a rare patience, rigor, and passion for close listening.

Chion visited Atlanta to give a talk at Emory in 2017. Having encountered the cryptic voice in Once Upon a Time in Anatolia, I carried my bewilderment directly to him. After his lecture, I approached him to recount the dialogue scene and to ask what he made of the voice emerging from unmoving lips. He knew the film well and immediately remembered the scene. His response, “It’s just the ambience!”, sounded unexpectedly dismissive, yet it was invaluable precisely because it exposed the limits of our established vocabulary. My aim is not to overturn Chion’s legacy but to expand and complexify the conceptual field by naming new audiovisual phenomena that contemporary cinema is producing. In this sense, I see terms such as anasonicity, cryptic voice, echoing sonic flashback, and muted image as the next theoretical steps after Chion: concepts that build on his groundwork but are calibrated for an emerging audiovisual landscape and explained through deep philosophical engagements.

3. The original English terms you have developed to describe moments in which synchronization slips, breaks, or is intentionally disrupted offer a significant contribution to the literature. How does your process of conceptual creation unfold? What theoretical, aesthetic, or phenomenological criteria guide the emergence of a new term?

A new term never precedes the phenomenon; it arises only when a film insists on it. My process is grounded in close listening (what I call a gesture of “listening through,” borrowing from Derrida’s method of “reading through” texts against themselves) and in allowing films to challenge the limits of the theoretical lexicon we already possess. This careful act of listening through these films involves returning to a scene again and again, in repetitions that arrive with difference and produce something new each time. After all, many of the sounds I study are barely audible, and some of the techniques I name appear only fleetingly in most films rather than in extended sequences like the example in Once Upon a Time in Anatolia. So an attentive ear is the key to the process.

Sometimes colleagues and friends help direct my attention to certain films. After my first presentation on the “echoing sonic flashback” in The Revenant (Alejandro González Iñárritu, 2015) at the Sinefilozofi Symposium in 2022, Dr. Serdar Öztürk mentioned a brief but striking use of the cryptic voice in Pelin Esmer’s Something Useful (2017), which I am presenting on at this year’s symposium. I am equally grateful to Jordan Chrietzberg, who recommended The Zone of Interest (Jonathan Glazer, 2023); to Jazmine Hudson, who pointed me toward Sinners (Ryan Coogler, 2025); and to Cameron Kunzelman, who suggested Memoria (Apichatpong Weerasethakul, 2021). These recommendations become invitations to texts that demand to be listened to with care. I am currently extending my research on what I term The Anasonic Zone of Interest, have begun developing a piece on Sinners, and still await the opportunity to encounter Memoria, whose limited circulation has made it particularly difficult to find.

To clarify the process of conceptual creation, I could list three simultaneous criteria that guide the emergence of terminology:
• Phenomenological precision: What exactly is being heard? At what level of perception: audible, barely audible, spectral, remembered, virtual?
• Narrative function: How does the sound alter temporality, embodiment, relations to memory, or the ethical space between characters?
• Theoretical necessity: Can existing terminology account for the phenomenon? If not, what new concept is required, and what conceptual gap does it fill?

I call these subcategories of anasonic sounds “impossible,” because their functions stretch the boundaries of audiovisual asynchrony as defined in established film sound scholarship. Cryptic voice, for instance, emerged from recognizing a voice that is spoken, heard by other characters, and fully audible, yet unaccompanied by lip movement. Echoing sonic flashback, which I explore through Park Chan-wook’s Lady Vengeance in my recent chapter for Derrida and Film Studies, names a distinctive form of aural flashback in which past sounds reverberate like an echo, closely following the sounds of the present. The muted image (bridge), which I introduce in a forthcoming 2027 article for a Derrida Today special issue on Anatomy of a Fall (Justine Triet, 2023), describes an impossible form of synchronization between images and sounds across two scenes, creating an impossible match that dislocates spatial or temporal continuity.

In each case, I am identifying an impossible doubleness: sounds that are both present and absent, synchronous and asynchronous, grounded or embodied and spectral. I guess my genuine curiosity drives the will to coin new terms each time I notice a mismatch between sound and image in contemporary films. Ultimately, conceptual creation begins with listening to what cinema is doing—and inventing terminology only when existing language can no longer describe its operations.

4. In contemporary cinema and television, the sound–image relationship is increasingly heterogeneous, fragmented, and often deliberately detached. How do you interpret this trend in relation to the broader transformation of contemporary narrative structures? What does this growing separation reveal about the perceptual habits of today’s audiences?

Contemporary audiovisual storytelling has moved further from classical notions of linearity, audiovisual unity, and strict synchronization, even in realist films or TV dramas. Rather than treating the soundtrack as a stable accompaniment to the image, or simply as its subordinate, many contemporary films mobilize sound as an autonomous and sometimes unpredictable force, which I find exhilarating. This fragmentation or destabilization reflects a broader transformation in contemporary narrative structures; stories increasingly unfold not as single temporal continuums but as intertwined temporal planes: memory, anticipation, dream, trauma, regret, and potentiality. For instance, my work on “crystal sounds” in contemporary global cinema and television traces multiple instances of these destabilizing sonic formations, even in otherwise completely disparate texts such as Barry Jenkins’s Moonlight and HBO’s Westworld. And I am certain there are further cases that similarly stray from conventional sound–image coherences.

New forms of asynchrony, in this context, become a perceptual challenge even within already fragmented contemporary narratives. These texts ask audiences to feel before they identify, to listen before they decode, and, to borrow from my own method, to “listen through,” again and attentively. Their refusal of easy comprehension is not unwarranted; I think they resist disposability. These works gain their ontological power from their radical sonic, audiovisual, and narrative experimentations. They force us to return to certain scenes repeatedly in order to engage with them at the philosophical level at which they operate.

Contemporary viewers are accustomed to media environments where multiple temporalities and sources coexist: streaming interfaces, multi-screen displays, and algorithmic feeds inundate our everyday realities. Viewers’ perceptual habits have become layered, fragmented, and non-linear. Cinema is responding in kind, producing radical forms of asynchrony that not only resonate with these habits but also challenge audiences further by demanding deep philosophical engagement.

Many of the films I study enact what Derrida calls différance, a temporal and spatial deferral, or what Deleuze theorizes as the “crystal,” an indiscernibility between the actual and the virtual. In this sense, the separation of sound from image is not fragmentation for its own sake. It is a mode of attunement to contemporary subjectivity, a way of making perceptible the disjunctions, overlays, and spectral echoes that define our media-saturated lives. And it is often precisely this radical rethinking of audiovisual relations that allows these films to do philosophy.

5. The digital media ecosystem, including streaming platforms, social media videos, and multi-screen environments, introduces new technical and aesthetic challenges to sound–image synchronization. How do you think these environments reshape the audiovisual relationship? Do you see these synchronization shifts evolving from technical glitches into deliberate aesthetic strategies?

Yes, what once appeared as “errors” or “glitches” are now being absorbed as expressive strategies. Digital media environments, including streaming platforms, TikTok videos, algorithmically compressed sound files, autoplay transitions, and skip functions, normalize the experience of rupture, elision, and discontinuity. Cinema has responded by formalizing these experiences: asynchronous editing, displaced soundtracks, spectral voices, or echoes of the past that intrude on present time. For example, the echoing sonic flashback and the cryptic voice are not accidents of mishandling but deliberate manipulations that express heterogeneity of time through memory, trauma, displacement, or temporal paralysis.

Of course, tight synchronization between image and sound and the accompanying expectations of temporal continuity and linear narrative progression remain the norm if we consider the thousands of films produced globally each year. However, the shift from analog to digital has introduced new aesthetic sensibilities and technical possibilities that continue to reshape what filmmakers can do with the soundtrack. For example, Mark Kerins was one of the first scholars to trace a level of sonic complexity to the creative potential of surround-sound multichannel formats. Others have similarly noted the increasing indistinction between sound effects and music in contemporary digital sound design, where layers of sonic material can be manipulated with extraordinary precision.

Digital tools have made it possible to craft soundtracks that are denser, richer, and more structurally complex. As a result, synchronicity is no longer the default formal expectation but merely one option among many. Digital media and digital culture, defined by compression artifacts, algorithmic modulation, nonlinear temporality, and platform-specific listening habits, have fundamentally transformed the conditions of auditory perception. In my scholarship, I see that cinema has responded to the transformed conditions of audibility by experimenting with the dramatic and philosophical possibilities of what I call “unsyncability.” Conversely, and perhaps more intriguingly, we can argue that cinema has anticipated and even instructed the audiovisual logics of emerging technologies and those who design them. For instance, I claim that the technique of the muted image, which foregrounds voice as media in the impossible synchronization between the voice and a pair of foreign lips, reappears today in the artificial synthesis of prosthetic voices and faces in deepfakes and AI-generated content of our current media ecology.

6. Publishing all your work and terminology in English makes your concepts more visible in international scholarship. How does this linguistic choice influence your theoretical framework? In what ways does producing terminology in English shape the nature or boundaries of your conceptual work?

To be completely honest with you, I have never pursued scholarly work in any language other than English. I attended Zonguldak Atatürk Anatolian High School, where nearly all courses were taught in English. My B.A. in American Culture and Literature and my M.A. in Media and Visual Studies at Bilkent University continued this trajectory, as English was the institutional language of instruction. As a result, my intellectual formation, reading habits, writing practices, and theoretical vocabulary have all been shaped entirely in English.

At the same time, English is the primary language of global academic discourse in Film and Sound Studies, and developing my terminology in English ensures that the concepts circulate widely beyond national contexts. I see scholars writing in languages other than English, such as German, Portuguese, or Finnish, citing my published works. I am not sure that publishing these concepts first in Turkish would have enabled that kind of international reach.

English also imposes a productive rigor. It demands conceptual precision: a new term must justify itself etymologically, analytically, and philosophically. This pressure toward clarity ultimately strengthens the concepts. For example, asonority could not have existed merely as a convenient linguistic parallel to Lippit’s avisuality, that is, as an elegant analogy invented in the final sentences of a 16-hour comprehensive exam. That day at GSU, I simply coined it without knowing what it meant or how to fully theorize it, and I finished my exam with a long list of questions in the space allocated for answers. Asonority/anasonicity had to accrue methodological and analytical clarity, enough to withstand the scrutiny of my dissertation committee: Angelo Restivo, Alessandra Raengo, Calvin H. Thomas, and especially Akira Lippit as my outside reader. I am deeply grateful for their patience, which allowed the concept to mature into the methodological framework it finally evolved into. In short, English, despite being my second language, has been a conceptual and philosophical playground for me throughout my entire academic life.

7. Looking toward the future of your research, what new theoretical questions are you pursuing within the study of sound–image relations? Are there particular themes or conceptual directions you plan to deepen in your upcoming work?

My current trajectory continues to expand the conceptual umbrella of anasonicity. At present, I am in conversation with Palgrave regarding my first book project, which will likely take the form of a short pivot, given that I have already published several peer-reviewed articles and chapters that have divided the larger project into smaller, thematically coherent components. I am also working on an article titled “Au revoir to voix: Muted Images in Anatomy of a Fall,” which introduces the term muted image as a technique that produces an impossible synchronization by pairing the visuals of one scene with the soundtrack of another. To my knowledge, the first use of this technique appears in Park Chan-wook’s Lady Vengeance (2005); it recurs at the climax of Justine Triet’s Anatomy of a Fall (2023).

A second term I am developing is the “meta-burden of representation,” which I use to analyze the self-reflexive structure of Cord Jefferson’s American Fiction. Here, Jefferson responds to the long-discussed “burden of representation” placed on artists from marginalized communities, yet does so within a work that becomes, through its very critique, burdened by the same representational expectations. This concept expands existing theories of race and representation by foregrounding the recursive, self-conscious pressures placed on creative labor itself.

Concurrently, I am pursuing a chapter for Bloomsbury’s The Music Video Industry: Interviews, Close Looks, and Takes, in which I examine the expanded terrain of the music video through an interview-based study of The Seasons, a large-scale audiovisual collaboration between composer Sebastian Currier and filmmaker Paweł Wojtasik. Among the questions I will bring to the artists first and then elaborate upon analytically in the second part are: How might we understand the lineage between expanded cinema as presented in concert halls (where films are screened with live accompaniment) and in museums (as installation-based, multi-format objects) and the contemporary music video? And, conversely, do music videos or experimental films with a music-video logic (Álvarez’s Now!, Conner’s Cosmic Ray and A Movie, Anger’s Scorpio Rising, Workman’s Precious Images, Devo’s Mongoloid) inform The Seasons’ approach to structure, rhythm, and montage?

Finally, although my published scholarship has thus far been exclusively in English, I intend to return to Turkish cinema with sustained attention. I have long been drawn to the sonic textures of New Turkish Cinema (mid-1990s to the present). Therefore, my next major project will be a second monograph on the sounds of this cinematic movement, exploring how the oeuvres of Reha Erdem, Pelin Esmer, Nuri Bilge Ceylan, Emin Alper, and Tayfun Pirselimoğlu respond sonically—as much as thematically—to the country’s evolving political landscape. This project will allow me to bring my conceptual framework into dialogue with the cinematic traditions that shaped my sensibilities, potentially in a bilingual format.

Across these endeavors, the guiding question remains constant:
What new forms of listening does contemporary media demand, and what new vocabulary must we devise to account for them?
