Listening to Cinema: Sound, Epistemology, and The Limits of the Visual

Interview Series: Sensory Cinema: The Culture of Sound | Interviewers: Gökhan Çolak & Arzu Karaduman

Photo: Salome Voegelin, TU Wien

Epistemology and Sensory Hierarchies

Your work challenges the dominance of vision within Western epistemology. How does a sonic epistemology reconfigure our understanding of knowledge production in cinema?

Visual epistemology is an epistemology of autonomous bodies and events. These are thus measurable, classifiable and nameable. The visual relies on the separating function of the gaze, to see the thing before seeing its contexts and relationships, which appear in a secondary viewing or measuring. Its knowledge system reflects this priority. And even when it seeks a knowledge of connection, this connection is understood as a measurable connection of two normally separate bodies. In sound this state of separation is impossible. Everything sounds together. Sonic knowledge is thus not the knowledge of a thing or body. Instead, it is the knowledge of relationships and relationalities. And instead of bringing separate items into contact, sound manifests as indivisibility: there is no sound alone; the tone or the phoneme are constructs of a visual, musical and linguistic system. Sound is everything at once, it is the contact, it is how we relate rather than me or you. Therefore, to know what we perceive to be a thing we need to listen to how it sounds with other bodies and more-than-human bodies, to sense what it is contingently. Steven Feld called this sonic epistemology an acoustemology, bringing together the notions of acoustics and epistemology. He was influenced in this naming by the Kaluli people in the Papua New Guinea rainforest, where he did his fieldwork in the 1970s and where he abandoned his visual anthropology and ethnographic methodology, which named and classified separate objects and events, to engage in the all together, because the sensory density of the rainforest did not allow a discrete view. There you cannot see the tree from the trees, but you have to listen to everything together with everything else to come to know from the together, and that together includes you, the listener. This articulates an acoustic epistemology that we could engage outside the rainforest too: to hear the world from its indivisibility and appreciate the knowledge that the dense simultaneity of sounds provides about contingent relationships rather than concrete objects.

Having said all of this, I believe we could see from sound the contingent relationality of the world. What stops us seeing relationally is not the eye as a physiological apparatus, but the entrainment in a cultural visuality that is ideologised by the notion of ownership, extraction, grasping and comprehending the world, rather than knowing with it also ourselves.

Caption: courtesy chaosmagicmusic, Cologne 2023, Kai Niggemann

Ontology of Sound

You argue that sound does not represent the world but produces its own reality. Could you elaborate on the ontological status of sound in relation to the audiovisual image?

I do not actually say it quite like this. What I suggest is that, given the sonic makes a relational world of many encounters, and sounds its indivisible reality, as described in response to your last question, what sound reveals is that the singularity of the world is an illusion. Instead, sound makes accessible, as in thinkable, the relational plurality of the world. There is not one actual world that is verifiably true for all of us. Instead, the real world exists in plural slices, some of which we find more actual than others; others remain only possible and others appear even impossible, but that does not mean they are not actual. This world thus exists as many possible worlds from which we negotiate, in contingent moments of encounter, a temporary actual world: a shared life-world.

The world that I live is actual for me but only possible for you. However, the sonic possible worlds I talk about are not irrealities or literary fictions, parallel worlds easily subsumed into a greater, unified real actuality. Instead, they are the plurality of this world that questions the singular appearance of what we might term actual, even though we are not verifiably sure that we agree on this actuality. Because the possibilities of this world question the values and norms of the actual. Their invisible plurality reveals them as an arbitrary and ideological selection and a construct, and not the only real.

Rethinking “Added Value”

Michel Chion’s concept of “added value” suggests that sound enriches the image. Do you see this as a limitation, in the sense that it still subordinates sound to the visual?

Michel Chion is a structuralist for whom the world and by extension film is a text, a semantic system. It is thus readable and knowable on the terms of its signifying structure. His theories unfold within film as such a cultural text to be read and interpreted. Thus when he says sound is “added value” he refers to how in film sound represents added value to the visual, without questioning the separation and consequent hierarchy between visual and sonic film track thus assumed. The film industry is very visually oriented. The visual makes the “pictures”. The sound can contribute to those pictures, adding layers of storytelling and affect, but it is not, within the privilege of the visual, the driving force or the orienting sensorium.

Therefore, when I disagree with him on the notion of added value, I disagree on two counts. First, I disagree that film is visual and that sound can add value to that visual a priori. Instead, I understand film to be multi-material and multisensory, and the question has to be about producing sense from a complex multisensoriality, not about adding value to a visual thing. Secondly, I disagree with his use of terminology because the sonic is not a thing added in post-production to add value to the primacy of the visual. Instead, the sonic is there at the moment of writing the screenplay, on location scouting, in rehearsals, on set, etc. I know it is the reality of much film production that the sound is not part of the pre-production discussions. It is something apparently “added”, sometimes as Foley, sometimes as ADR after the event. And even if it is recorded on location it is added to the film track not as a sound track. But this is Hollywood-inspired filmmaking, a picture-book filmmaking of stories on celluloid. There is another kind of filmmaking that understands the indivisibility of the audio-visual, a materialist filmmaking that does not add sound to film but understands how their indivisibility produces a scene. And conversely the film critic engaged in that multi-sensory world cannot speak about adding value to a visual track but must contemplate the simultaneity of all tracks, even the absent ones. So nothing gets added because nothing is apart.

Photo: BBC Radio 3 – Late Junction, Max Reinhardt with Salome Voegelin

Phenomenology and Embodiment

Your approach often intersects with phenomenology. How does listening as an embodied experience reshape the spectator’s relation to cinematic space and time?

Again, we need to be particular as to what sort of film-making we mean and what sort of spectator we mean. If my expectation of film is narrative clarity, storytelling in a semantic way, it is probably not desirable for the viewer to propose they should listen to the sound track to sense their being in the film as a being in the world of the film and the film becoming the film through that intersubjective and reciprocal experience. Phenomenology, unlike structuralism, engages perception as a reciprocal process of being in the world which becomes the world it is for us through our being in it. It performs a reduction, an époché, in order to understand this experience rather than an a priori object or event. Phenomenology brackets the apparently known, to get to the experience at that moment. It does not read the world or film, from a pre-existing vocabulary or signifying system, but engages in the experience as a particular vis-à-vis constituted in our being with. Viewing thus becomes a ‘sensory-motor action,’ a doing perception also of sound that generates rather than perceives what it sees, hears and senses. Consequently, given that this sensory motor action of a listening-viewing generates the world of the film, our engagement demands responsibility and care. I am responsible for how I listen and what I hear, and also for what I do not hear. These sensory motor engagements with the world generate my life-world, my sensory world that I understand myself with. I am responsible for this world/ film world I generate from my being in the world/film world.

In this phenomenological understanding of the world as constituted in sensory-motor actions as life-worlds and the understanding of the film world as such a life-world, there is no distance that enables reading. The suspense of filmic reality is not a theoretical but an actual suspense, an époché, that allows us to see things differently and thus to engage in the experiential reality of the film as a reciprocal and responsible world.  

Photo: Salomé Voegelin – University of the Arts London

Silence and the Sonic Negative

Within your framework, how can silence be theorized beyond absence, as a productive, material, and even disruptive sonic condition?

Silence is not the absence of sound but the beginning of listening. I wrote something like this over 15 years ago in my first book, Listening to Noise and Silence: Towards a Philosophy of Sound Art. To me this has only become more pertinent, particularly in relation to politics and what we cannot or do not want to hear and how to start hearing it. It is important to note that silence and noise are not binaries. Instead, they are on a perceptual spectrum, where they are not in opposition to each other but highlight extremes that are part of all sounds when they are listened to beyond a referent or name, in the contingency of the relationship that sounds. All sounds can be noisy and all sounds can be silent. And silence can engender disruption as much as noise can calm things down. Silence can disrupt the flow of the usual, what we think we hear, and how we hear it. It can create uncertainty and fear as we lose the baseline of the signal path.

We talk about noise in relation to the signal-to-noise ratio and how noise disrupts what we can hear, as in understand and make sense of, rendering it illegible, and how in this way noise is the undesirable sound of telecommunication, science and writing. However, silence can equally be heard as “no signal”. Its undesirability is not loud, and thus it is not even noticed. It does not impede the signal but holds the signal in the thick ambience of what we do not know to hear or listen out for. It sounds the conceit of a clear signal by sounding the condition of the unheard and the excluded. Once we tune into silence, once we become aware of its potential to make us hear differently, more and otherwise, then we can start to hear the silent hum that is the no signal of a different speech. And that is why I think the notion that silence is the beginning of listening remains and gains in relevance. Because at this political conjuncture we need to hear also what you hint at by the term Negative: that which seems upside down and strange but holds the imagination of a different world, which in the process of photographic development gets rendered into the shapes and forms we recognise.

Subjectivity and the Unconscious

If sound engages the listener on a pre-reflective or unconscious level, how does this affect the construction of subjectivity in film experience?

I do not think sound engages the listener on a pre-reflective level but demands of us that we not only suspend our disbelief in relation to the veracity or feasibility of the narrative, but that we suspend the reality of the filmic apparatus, its visual organisation and distance, so we might come to understand film as a reciprocal and intersubjective life-world and appreciate our responsibility for what we see and hear. This presumes an ethical listening and an ethical subjectivity that is aware of its participation in the heard and also in the silent, the unheard and ignored. To still use the prefix pre-, I would say that this listening with responsibility and a participatory ethics, understanding one’s role in generating the heard and also what remains inaudible, engages in the ‘preliminary’ rather than the ‘pre-reflective’. I am thinking this term with Hannah Arendt, who in her 1954 text ‘Understanding and Politics’ suggests that we have lost the ability to make sense of the new because we rely on the familiar to grasp what is entirely unfamiliar: what cannot be understood within the rules of common sense and the ‘normal’ by which we tend to measure and recognise the real, and to which we thus reduce it. In particular, she addresses the failure to understand totalitarianism, which is confused with imperialism because it is read through old signifiers that prevent us from seeing its own particular and new evil. This renders us unable to pursue appropriate political actions in resistance and to develop a relevant political subjectivity. In response she suggests an emphasis on the preliminary, the not yet named, where a word first appears as a new word and a new sound, and where its newness can be understood in its preliminary articulation, before it has been folded into existing categories and meanings.
Film that engages this preliminary, that does not seek to tell the actual from its past representation but allows us to experience the possible on its current terms, can encourage a sense of the preliminary. I suggest sound can enable the imagination of the preliminary, as it does not close itself off in representation but invites an uncertain listening. And from this listening in a preliminary mode, that is, as a sensory-motor action which generates the heard in its unfamiliar newness, a new understanding can be reached and a relevant (political) subjectivity imagined.

Salomé Voegelin – 4th Council of Europe Platform Exchange on Culture and Digitisation, Photo: ZKM

The Political Dimension of Listening

Can listening be understood as a site of resistance? How might sonic practices destabilize dominant regimes of visibility and representation?

Listening, in the sense that it is a sensory-motor action, a movement forward from listening into action, is always already engaged in action and thus also in the possibility to refuse or reject an action. I am invested in thinking listening as a paradigm shift: to shift a conventional scopic regime and its focus on things towards an acceptance and practice of the indivisibility of this world. This in itself represents a rejection of the dominant visual regime, which starts from the separate and the discrete, and seeks to measure, name and classify it before bringing it into context and relationality, and thus always keeps it apart. This rejection is not a not looking. Instead it means to practice listening to film work and the world, to appreciate its multisensory indivisibility, and to turn the visual sense on its head. What we are left with is how things are by being together and in contingent encounters. This requires a new sense of thinking the world, and ourselves in this world, which naturally destabilizes how we look and what we see. This implies new values and a new understanding and a new imaginary so radical and different that it will never happen. This does not mean we should not pursue it, however, since indivisibility also means reciprocity and demands responsibility, creating an awareness of interdependence: these are all competencies and sensibilities that we need to understand and live with this complexly interwoven and interdependent world and its various political, economic, ecological and social crises.

Photo: The Attic, Salome Voegelin

Interdisciplinary Positioning

Your work operates at the intersection of philosophy, sound art, and media theory. What methodological shifts are necessary for film studies to fully integrate sound as a primary analytical category?

I think there are wonderful films that practice great awareness of the sonic as a concept and materiality and that produce fantastic sound tracks that enable a little of what I mention above: the appreciation of the film as a multi-sensory possible world that invites the understanding of the plurality of filmworlds generated. The problem, or rather the emphasis and priority of the visual, is strong and maybe insurmountable. The technology, the way a film set operates, is clearly driven from the image. Therefore, on the one hand, to undertake such a methodological shift into a sonic film theory, we would have to reset the film set, rethink how we make films, how we act, direct, edit and track lay. And then we would have to have training for listening to film, to generate a sonic sensibility and come to understand how film generates the plural slices of its indivisible world rather than proposing the meeting of the discrete: the story, the character, the action.

In a sense, and referring to an earlier answer to one of your questions, we would need training to listen to the film’s preliminary experience, conjured in sound, and we would need courage and desire not to fold it always already into history and the familiar but to follow the uncertainty, the perpetual present of sound, into the film’s unfamiliar materiality. To start to see film from its sound and thus from its indivisibility, of which we are part. To become sonic subjects, human bodies with other human and more-than-human bodies, in a close relationality and responsibility, and write from there. This would not hinder criticality, a criticism often levelled against sound for its lack of critical distance when it is not treated in a structuralist scheme. Instead, a sound theory of film could develop a more relevant and pressing criticality, of lived and heard relationships of a vibrational film practice. It could work from the sensory-motor action of listening relationally and reciprocally, understood as an effort of generating rather than viewing the film. Thus, we would become vibrational bodies and write theory from our entanglement in the vibrational sphere of film, always aware of the vulnerability and responsibility of the viewer as listener to what they see and hear and what finally they write about. The task would be to theorise from that entangled position, understanding its responsibility and understanding that the rigour of this criticism would thus be legitimised by the body of the critic rather than by canons and pre-existing contexts of how film is written about.

Photo: Texts + Talks | Salomé Voegelin

Technology and Artificial Sound

With the increasing presence of synthetic and AI-generated sound, how might we rethink authenticity, presence, and materiality in sonic experience?

This is a huge question and I am not sure it is answerable, or that I can answer it truly at this stage. The problems with AI for sound tracks, as I see them at this moment, are very similar to the problems of AI in literature or philosophical or any writing. AI does not think in terms of relationship, it does not understand its indivisibility and contextualisation. It only recognises patterns and frequencies. In that sense it is a scopic tool. And in its quest for the most frequent it erases that which is not often sounded, the marginal, the excluded and discriminated against, and amplifies the most prevalent. In this way it speeds up a hyper-hegemonisation of the visual regime and pursues a data standardisation of its materiality and sense. Thus it erases difference, diversity and plurality, and it erases the body as a site of multi-sensory response: in the end we will only have a handful of sounds and a handful of words, and the body has gone. AI is the great depletion machine. It does not so much make us rethink authenticity and reality but erases our thinking altogether. We will be confronted with frequency presence whose authenticity has nothing to do with experience and relationships but only with numbers and how often they might appear. AI authenticity is probability and speed. This fits quite well, incidentally, with the hype of future betting markets, markets for trading the future by guessing what somebody might say or do and how often or when. This form of betting on a probable incident rather than on analytical predictions accelerates the status of the stock market as casino, and appears to represent AI’s total realisation of the world in frequency terms. This is of course the very opposite of what I hope for with a paradigm shift towards sonic indivisibility, complexity and responsibility. Instead, we are moving with great speed and zero responsibility towards the erasure of an experiential world by numbers and words as numbers.
As sound designers and sound artists as well as critically listening viewers, we must ask ourselves at what cost and for what benefit do we want to work with AI?

On the other hand, or at the same time, sound and a sonic thinking might reveal themselves as the perfect resistance tools against an AI frequency world. And my desired paradigm shift will happen out of necessity: to keep our bodies, to keep our lives.

Conceptual Closing

If cinema were to be theorized primarily through listening rather than vision, what would be the most significant conceptual shift for film theory?

To write about film not as interpretation of scenes, and dialogue, and moments and plots, but as indivisible materiality that thinks in connections rather than in things by themselves, and that creates sonic possible worlds, the plural slices of this world and of the film world, would mean to write about everything at the same time. I am not sure we can write that way. It would be a challenge. But we surely should try, so we could develop a cultural visuality from our ears, able to understand the vibrational, indivisible and relational experience of film and sense ourselves within it.
