The State Of Game Audio

This piece on the state of contemporary game audio was first published in Edge magazine, earlier this year. In it I talk to Marty “Halo” O’Donnell, CryTek’s Florian Füsslin, Introversion’s Chris Delay and the legendary George Sanger.

Game design lecturer Tom Betts is feeling pretty downbeat about the attitude of his students towards videogame audio. “I do a few lectures on this topic and unfortunately it often comes down to the fact that while you can play a game with the sound off, you can’t play a game with the screen off.” If you’re studying the things that make a videogame work, sound comes way down the list. Why should Betts’ students worry about what he has to say on the subject of audio when there are so many other things to worry about, like visual design, level design, or the nature of puzzles? “It’s been an underdog for years,” says Betts.

The attitude of his students is understandable, of course, because games have always been such a potent visual medium. Even the most successful sound designers, such as Marty O’Donnell – whose work defined the Halo series – recognise that sound takes a secondary place in our attention. “Because we get tangible information from our eyes and more intangible or visceral information from our ears, most people don’t think about what they’re hearing,” says O’Donnell. “We can gate our senses, but our ears never blink.” O’Donnell points out that even though great film directors such as Steven Spielberg put great emphasis on sound design, it generally only gets passing credit. “Perhaps it’s fair that sound takes a back seat because that’s how we’re wired, but those of us who are sound designers know how much influence we actually have.”

The truth is that sound design has become one of game development’s most sophisticated tasks. Designing music and sound effect systems for use in game environments is a rather different challenge from that of simply composing music, or even making soundtracks for films or television. Games present some unusual problems, like the mix having to adjust itself to suit a situation created by the player, rather than the static vision of a single director. Game designers need a flexible attitude towards factors such as the amount of time spent listening to the same piece of music and the potential for sonic overload if too many game sounds are played simultaneously. Not only that, but many sound designers find themselves working on tasks that are defined entirely by non-musicians and the audio-illiterate: the producers and lead game designers. It can be a serious challenge.

The Gauntlet

Audio, like so much else in a game, has to convey information to a player. CryTek’s Florian Füsslin explained that Crysis’ lavish soundscape was defined primarily by what information the player needs to hear. “We often went for the concept ‘less is more’ or let’s better say ‘important things first’. We used a pretty solid priority system which cuts quiet or unimportant sounds in an audio busy situation like combat. Together with the right mix we were able to provide a dense soundscape in all situations players might run into.” So creep through a jungle and you might be overwhelmed by the seething ambience, but enter combat and your attention is allowed to adjust instantly to the yells of enemies and the position of their gunfire.
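The kind of priority system Füsslin describes can be sketched in a few lines. This is only an illustration under stated assumptions, not Crysis' actual implementation: the Voice class, the select_audible function, the sound names and the three-voice limit are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Voice:
    name: str
    priority: int   # hypothetical scale: higher means more important to the player
    volume: float   # 0.0 (silent) to 1.0 (full)


def select_audible(voices, max_voices):
    """Keep the most important voices, breaking ties by loudness."""
    ranked = sorted(voices, key=lambda v: (v.priority, v.volume), reverse=True)
    return ranked[:max_voices]


# A busy combat moment: more sounds are requested than the mix has room for.
requested = [
    Voice("jungle_ambience", priority=1, volume=0.4),
    Voice("distant_bird", priority=1, volume=0.1),
    Voice("enemy_shout", priority=3, volume=0.8),
    Voice("enemy_gunfire", priority=3, volume=0.9),
    Voice("player_footsteps", priority=2, volume=0.3),
]

audible = [v.name for v in select_audible(requested, max_voices=3)]
print(audible)
```

With only three voices available, the quiet, low-priority ambience is cut and the combat sounds survive, which is the “important things first” behaviour described above.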

Realism often has to take a back seat in the audio systems that games create, even in a game with otherwise realistic environments, like Crysis, as Füsslin explained: “Making game audio is often a balancing act between realism and ‘keep it readable for the player’. For example shooting two assault rifles might sound similar in reality, but in the game the player has to know precisely which weapon has fired. In this case the readability was more important and therefore given the priority.” Games are often hugely truncated in the sensory input that they offer the player, and so audio has to function in a manner that supports what gamers can already see on the screen. The clearer the message, the better.

Another famous example of sound as a “readable” gameplay cue was the audio design by Eric Brosius in the Thief games. His team put a great deal of work into things like footsteps, which enabled the player to instantly comprehend whether he was being stealthy or noisy. Clanky metal floors and creaky floorboards were boosted far beyond a realistic level, giving the player the aural nudge he needed to realise that his creeping was no longer going to go undetected. Likewise the “barks” of the guards on a level had to be unambiguous: it was essential that a Thief player know if suspicions had been aroused, or if he had been spotted. Bad game sound is seldom pinpointed by gamers or by critics, but there’s a good chance that better sound could have done a great deal to make a bad game better.

Symbols for Cymbals

However, there’s another far more perplexing factor in designing game audio, and that’s the /art/ of it. It might well be 99% tech know-how and perspiration, but the 1% of artistic inspiration is often what makes a game’s soundtrack a success. Just being “readable” is seldom enough. One person who knows this better than most is veteran sound designer and musician George ‘The Fatman’ Sanger, who has been working on videogame soundtracks since he first penned a ten-second ditty for the Intellivision game, Thin Ice, in 1983. “There’s a myth, a fallacy going around in game design that I think is taken as truth, and it’s that the only job of audio is to support the rest of the game,” says Sanger. “It sure sounds smart, and people say it: ‘I don’t need to innovate, or to write songs, or change the idea of what an orchestra is, all I have to do is support the game.’ The thing they’re getting wrong is that supporting the game is not their only job: you still have to blow someone’s brains out with joy. It takes quite a person to push that truth through the bureaucracy that believes audio is just there to support the game.”

Exemplifying this point, Marty O’Donnell describes the process he faced with the Halo games: “It’s a slow methodical process with occasional bursts of insight and creativity. For me there is a lot of time spent with the artists, designers, and programmers of the game. Eventually, after working on many presentations, trailers, and early playable versions of the game, a palette of sounds and music emerges.” Unlike visual design, which can be appraised at a glance, audio often needs to operate in conjunction with visuals to be understood. O’Donnell recalled that he had to develop his music concepts privately before they could be judged in the context of the game Bungie had created: “If I had told the guys at Bungie in 1999 I wanted to use monks singing Gregorian style plainchant to introduce Halo to the public, that music might never have seen the light of day. Instead I had the opportunity to respond musically to the moment and the drama of what Bungie had created and it just felt right.” O’Donnell, like Sanger, knew that sound design had to take risks and to pursue fresh ideas to reach its potential. He had to push through his idea so that Halo didn’t end up with just another faux-metal shooter soundtrack.

Of course it’s not about pursuing personal agendas either: sound designers have to meld their artistic inspirations into what the game’s visual designers are trying to do, and when both aspects work in unison it completes even the smallest nuances of design, as Introversion Software’s Chris Delay explained. His team found that some aspects of their game simply “felt different” when they had the right noises attached to them. “With Darwinia we noticed the bizarre fact that animations actually looked better when they had good sound effects – the audio was enough to ‘sell’ the animation and convince the brain it was good. Visual effects that didn’t have sound effects to go with them often felt flat and lifeless.” Good sound design, it seems, is symbiotic with good game design generally.

Waveform Change

Of course game audio is not the same beast that it once was, and the technology has changed enormously in the past twenty-five years. When it started out there was little more than brief sequences of tonal bleeps, and now sound designers deliver forty-piece orchestral epics to our ears. Talking about the long progress to 2008’s complex soundtracking, George Sanger recalled his early work: “At the very beginning of my career I wrote the music out on a piece of paper. I was lucky enough to have a musically literate programmer, who was able to, and I’ll be the first to say this for you, turn it from musical notation into ‘beeps and boops’. Ha! The worst thing you could say about sound design is ‘it’s not just beeps and boops any more,’ at least to the eyes of a sound designer.”

The truth is that game audio wasn’t beeps and boops for very long at all: it rapidly started to use “Musical Instrument Digital Interface”, or MIDI, to allow game designers to compose music for games directly, as Sanger did in his pioneering work: “I started using MIDI and people would turn around and say ‘you don’t know the first thing about writing music for games’, after I’d been doing it for almost ten years. Of course a little later on I was in the right place at the right time to start using the first MT-32 [Roland’s MIDI synthesizer] on some early games like Loom and Wing Commander.” (Both were released in 1990, and both are noted for their pioneering attitude towards sound and music.) Using MIDI meant that game audio had, from early on in its evolution, access to the immediate profundity of musical inspiration and experimentation. Sanger continued: “When people did orchestral music or classical music, they were just typing in from the paper, because they could. I think I was the first person – well, I’d like someone to prove me wrong on this one – I think I was the first person to use the dynamics and tempo from a good performance in a game, in [the case of Loom] a version of Swan Lake.”

While MIDI and sampling allowed access to high quality sound effects and musicianship, game audio was not limited to classical soundtracking the likes of which we’d seen before in film. There was another dimension which game designers had to take into account – the activities of the player and the changes they cause in a game world. The most important element to consider in game audio is the one that dominates the entire medium: interactivity.

Points In Time

The key tool in making audio interactive has been 3D audio rendering. This is the realistic environmental and spatial audio that we now routinely encounter in 3D games, the array of effects that allow helicopters to buzz overhead, or ambient sounds to be tied to particular areas. George Sanger explained a little of how this works, and how its nature limits what sound designers are able to do: “Interactive audio ties sounds to objects. A missile can come buzzing at you, or a looping waterfall sound is tied to a waterfall. At the simplest level it allows you to play music in one location, and another location, and set the volume for each, and determine if they’re going to cross-fade. I can’t do that myself. I can only write it in an email and send it to the programmer, and he rolls his eyes and says ‘why is he doing that’ and so it gets lost. It’s not the programmer’s fault, but when the tool doesn’t exist it’s hard. At the very least there needs to be support for a text file that a sound designer can edit to load into the game and that the sound engine can see to know what to play, how loud, and how often. The sound designer can then load up the game and instantly change how loud the birds are singing.” The lack of such tools is the biggest single stumbling block for sound designers working in the industry today.
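To make Sanger’s request concrete, here is a minimal sketch of a designer-editable text file and the code that might read it. The three-column format, the sound names and the parse_sound_table function are all invented for illustration; they do not describe any shipping engine.

```python
def parse_sound_table(text):
    """Parse lines of 'name volume seconds_between_plays' into a dict."""
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding space
        if not line:
            continue  # skip blank and comment-only lines
        name, volume, interval = line.split()
        table[name] = {"volume": float(volume), "interval": float(interval)}
    return table


# Hypothetical contents of a file the sound designer edits by hand.
SOUNDS_TXT = """
# name        volume  seconds_between_plays
birdsong      0.35    4.0
waterfall     0.80    0.0   # 0 means loop continuously
"""

sounds = parse_sound_table(SOUNDS_TXT)
print(sounds["birdsong"]["volume"])
```

If the engine re-read a file like this on load, a designer could change 0.35 to 0.6, restart the level and hear louder birds without involving a programmer.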

That’s not to say that there aren’t already some tools that help designers out in creating interesting environmental audio, as Peter Harrison, Creative Labs’ European Digital Media and Relations Manager, explained when he enthused about the technology that makes 3D audio a possibility. “When we released the EAX 2 functionality we made a big leap,” says Harrison, talking about the 3D audio standards that came along with the early SoundBlaster Live sound cards. “The idea of design tools was to showcase this technology by developing for it, but if the technology is going to be successful then you want your ideas to be adopted into the developers’ own tool chain and asset management. Developers do things their own way, and we’re not trying to boss people around, or make money from design tools. A successful tool made by us will make itself redundant.” And this one did.

Harrison explained that Creative had authored a tool called Eagle in 2001, which allowed users to import a map’s geometry and then add sound simply by placing boxes round areas and rooms – the zones for environmental audio. All the audio effects and filters that players experienced (such as a noise being in the next room, or acoustically altered by being in a corridor or wide-open space) were placed at the fingertips of level designers. “These could have reverb settings attached, occlusion settings attached, and all the source positions for rendering the listener position,” says Harrison. “The success of Eagle was huge, but that success made it redundant, because now, having been inspired by what we did with Eagle, most developers have integrated this kind of tool into their editors and engines. An Unreal licensee or consumer using UnrealEd will have that kind of functionality in there and be able to use it right away.” Getting new effects to the designers is, Harrison explained, the true frontier of where sound design has to go in the future.
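The box-shaped zones Harrison describes can be illustrated with a rough sketch. This is not how Eagle or EAX worked internally: the Zone class, the single reverb number and the rule that a sound is occluded whenever source and listener sit in different zones are simplifying assumptions made purely for the example.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    name: str
    min_corner: tuple  # (x, y, z) of the box's lowest corner
    max_corner: tuple  # (x, y, z) of the box's highest corner
    reverb: float      # hypothetical scale: 0.0 dry to 1.0 very reverberant

    def contains(self, point):
        return all(lo <= p <= hi
                   for lo, p, hi in zip(self.min_corner, point, self.max_corner))


def zone_at(zones, point):
    """Return the first zone whose box contains the point, or None."""
    for zone in zones:
        if zone.contains(point):
            return zone
    return None


def describe(zones, listener, source):
    """Pick the listener's reverb and flag sounds coming from another zone."""
    listener_zone = zone_at(zones, listener)
    source_zone = zone_at(zones, source)
    return {
        "reverb": listener_zone.reverb if listener_zone else 0.0,
        "occluded": listener_zone is not source_zone,
    }


zones = [
    Zone("corridor", (0, 0, 0), (10, 2, 3), reverb=0.6),
    Zone("great_hall", (10, 0, 0), (30, 20, 8), reverb=0.9),
]

result = describe(zones, listener=(5, 1, 1), source=(20, 5, 2))
print(result)
```

A listener in the corridor hearing a sound from the hall gets the corridor’s reverb setting and an occlusion flag the mixer could use to muffle the sound.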

Game audio remains an enticing frontier for George Sanger too. He now runs an interactive audio think-tank called Project Bar-B-Q, which is attended by sound designers as well as the software and hardware fraternity that supplies their tools. Sanger believes that there’s still a long way to go before sound designers actually get what they need from gaming audio technology, and his think-tank is designed to help that along. “There’s a lack of consciousness and there’s a lack of tools. This is because there’s no equivalent of General MIDI for interactive audio.” Sanger hopes that tools such as IXMF, a standardised, open-source, container file-format penned in part by his wife and other Project Bar-B-Q attendees, will unlock the potential for game audio in the future. “That would take out the whole, primitive low end of audio,” says Sanger. It would, in short, start to provide more of the kind of tools that visual artists have had access to for quite some time.

Up Tempo

If there’s one thing that’s clear from a survey of current-gen game development it’s that while we’re currently bathing in the glow of next-generation visuals, we still haven’t quite benefited from next-generation audio. Small advancements are being made all the time – such as the HRTF systems that mimic surround-sound effects on headphones – but there are still some big steps to come.

Harrison was keen to point out that companies like Creative are leading the charge into the next generation of audio, and that once their innovations are widely adopted they’ll have ramifications for the game audio we experience on a day-to-day basis. “There are a number of real-time effects that are becoming particularly important,” says Harrison. “If we look at all the reverb and filtering effects they’re what we call ‘time domain effects’. To explain this: if you look at a wave editor you see a 2D graph, with time and volume, and the time domain effects effect changes in these two dimensions. But then there’s a third dimension to sound (rather than space) which is frequency, and we can have our sound data in three dimensions, which is the amount of sound energy in different frequencies. Once you have sound data in this domain there’s a lot more you can do with it.” Harrison cites “Rockafeller Skank” by Fatboy Slim as an example of these kinds of effects in action. “That bit where vocals are stretched out? That’s it.”

Frequency domain processing will give sound designers far greater flexibility and control over the processes that they can drop into audio in real-time: “Once you have sound mapped to time, volume and frequencies many more effects and processes become available, especially with stretching and distorting sounds. You can analyse dialogue samples and change the way people’s voices sound, and so on. This is starting to be used in games already.” When companies like Creative fully get to grips with frequency domain processing in game audio we’ll see some big changes in what sound designers are able to do.
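The move from time and volume to frequency that Harrison describes is, at its simplest, a Fourier transform. The sketch below uses a deliberately naive discrete Fourier transform on a made-up test tone to show how much energy sits at each frequency; real-time audio code would use a fast Fourier transform on short overlapping windows, and nothing here represents Creative’s own processing.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return the energy in each frequency bin from 0 up to half the sample count."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        total = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(samples))
        magnitudes.append(abs(total))
    return magnitudes


# Time domain: 64 volume samples of a pure tone that completes 5 cycles.
N = 64
wave = [math.sin(2 * math.pi * 5 * t / N) for t in range(N)]

# Frequency domain: nearly all of the energy lands in bin 5.
mags = dft_magnitudes(wave)
peak_bin = max(range(len(mags)), key=lambda k: mags[k])
print(peak_bin)
```

Once a sound is held as per-frequency energy like this, effects such as stretching a vocal without changing its pitch become a matter of manipulating those bins before converting back to a waveform.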

But perhaps the most vital part of any next-generation audio will be aesthetic sensibility and artistic innovation. “I think creative use of silence can be important,” says Harrison. “If you look at Ico on the PS2 there’s a lot of space in the soundtrack and a lot of quiet, ambient sounds. There was often not a great deal going on, and as a gamer I really appreciated that experience. I think there should be a little more consideration for these kinds of approaches in game soundtracks.”

Harrison is not the only one who sees scope for greater creativity in game audio. “No one has conquered game audio,” says Sanger. “The greatest of them all, for a while there, was Michael Land. He created the music for The Dig, which is on a record label for a reason: it’s good. But it’s a linear piece that got a record deal with Angel or whatever… Afterwards he came to me with his big beard and he said ‘I don’t think interactive audio will ever really be possible, it’ll never be great art.’ This is one of the greats saying this… and this is because one of the most important aspects of music is /timing/. You need to know what’s going to happen and when. Composing interactive music for games is like, well, rather than making a painting, you’re mailing colours and a list of directions to some kid who wants to look at the painting and getting him to put it together. That’s the massive, impossible goal. And the remarkable thing? We’re getting close. Every week I hear about some idea, or some young guy comes along with a new angle. It’s tantalising. I think there’s going to be an explosion of interactive audio art, and it’s going to happen because of games.”

Returning to game design lecturer Tom Betts in his Huddersfield studio, we begin to find that there are a number of reasons to think that game audio’s evolution still has some way to go. “The problem with game audio, particularly music, is often how quickly it can adapt,” Betts explains. “Say in Tomb Raider audio might be triggered by location, so I can step into a giant vista and soaring symphonics start up. Then I turn right around and hide in the murky brick tunnel I came from… the audio doesn’t react fast enough, so I get a symphonic brick tunnel.” There is a solution to this kind of adaptive audio, says Betts, and it might well hold the key to the future of both art and technology. “Generative audio can potentially produce tracks down to a more granular level,” says Betts. “So, for example, the drum track of a piece playing could have extra hits added while you are in combat as you actually hit, like in Rez. It could also change other elements of the audio as it’s running by altering parameters on the fly. This only works if the audio is being semi-composed in real time… of course it’s hard to do, and really processor intensive, so people don’t do it.” It seems that the solutions to creating a new path in game audio are already there, but it’s a hard road to take.
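Betts’ drum-track example can be reduced to a toy sketch in which a single game parameter decides how many extra hits a bar receives. It is a fixed rule rather than true generative composition, and the sixteen-step pattern, the fill order and the drum_bar function are all assumptions made for illustration.

```python
# One bar of sixteen steps with a hit on each beat (1 = hit, 0 = rest).
BASE_PATTERN = [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]

# Hypothetical order in which empty steps fill in as combat heats up.
FILL_ORDER = [2, 6, 10, 14, 3, 7, 11, 15, 1, 5, 9, 13]


def drum_bar(intensity):
    """Return one bar; intensity runs from 0.0 (calm) to 1.0 (full combat)."""
    extra_hits = round(intensity * len(FILL_ORDER))
    bar = list(BASE_PATTERN)
    for step in FILL_ORDER[:extra_hits]:
        bar[step] = 1
    return bar


calm = drum_bar(0.0)
skirmish = drum_bar(0.5)
print(calm)
print(skirmish)
```

Because the bar is rebuilt from the current intensity each time it is requested, the music can follow the player within a bar rather than waiting for a pre-recorded track to finish, which is the kind of fast adaptation Betts is asking for.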

Sanger offers a similar diagnosis: “Game audio is getting one little aspect improved here, one little aspect there, but there is no example of the thing that’s as different as Katamari, as fun as Guitar Hero, as interactive as Monkey Island, and still uses its instrumentation in a way that is thoroughly musical. We run from this, and I don’t like it. We start talking about business, about how it’s possible to do a 40-piece orchestra.” The only solution, Sanger suggests, is for a Miyamoto of game audio to step up and shake the entire industry’s foundations. Only by fighting the corner for sound design, and moving the bureaucratic mountains that get in the way, is anything going to get done. “It is rough, and the stories [about sound design troubles] are daunting,” says Sanger. “But those stories, those experiences, are the only thing that will take a newbie and turn him into a bad ass legend.”

