Article by Alyx Jones
Edited by Sam Hughes
Alyx Jones recounts her experience of this year’s AES Audio for Games Conference back in February.
The AES 61st International Conference, Audio for Games, held at The Royal Society of Chemistry, London, took place from Wednesday 10th to Friday 12th February 2016. This renowned event draws game audio professionals from all over the world to hear a wide range of subjects presented over three days. The conference is usually held every other year, but with growing interest in game audio and the arrival of consumer virtual reality, it is currently being held annually.
Kicking off the conference on the Wednesday was Scott Selfon, engineering lead at Microsoft, with his talk "Tales of Audio from the Third Dimension". Scott opened by covering many topics that would become the focus of the week, such as applications of audio to VR and the problems with HRTFs (Head-Related Transfer Functions). The main issue is that the average consumer hasn't had their own HRTFs measured, so using an averaged model lacks accuracy and customisation, since every person's ears are unique. He also touched on the many in-game factors that affect sound design, such as humidity, temperature and air currents. Spatial audio is a hot topic in game audio at the moment, as advances are expected to align with those in virtual and augmented reality, and Selfon also mentioned the current volume standards for spatial audio. Head movement is a highly important factor in helping humans locate sounds, but Selfon described a "donut" model of approximately 30 degrees above and below the listener, and 60 degrees surrounding, as the optimum placement of sounds before they become difficult to locate. Some really useful information, especially for those of us exploring 3D spatial audio and its practical applications.
After a short break, Alex U. Case followed with "Timbre FX for the Sound Designer". It was a really refreshing presentation, urging sound designers and composers alike to consider the array of effects at their fingertips and their application to the timbre of a sound, rather than just their use in enhancing spatialisation. In games, with the potential for so many sounds playing at the same time, he said "it's important to simplify sounds into their most important characteristics", e.g. the 50–60Hz boost that gives a kick drum its "punchy" quality. Tools such as reverbs and delays can be amazing for changing or enhancing the timbre of a sound, but even compressors can be used to reshape the envelope of a sound, thus changing its timbral characteristics. He finished with examples of how distortion can be used to add dynamics to more intense performances or parts of a sound.
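To illustrate Case's point about compressors as timbre tools, here is a minimal sketch of my own (not from the talk) of a hard-knee downward compressor. The slowish attack lets the initial transient through before gain reduction engages, which is exactly how dynamics processing can emphasise a sound's "punch" rather than just control its level:

```python
import math

def compress(samples, rate, threshold_db=-20.0, ratio=4.0,
             attack_ms=10.0, release_ms=100.0):
    """Hard-knee downward compressor (illustrative sketch).

    A slowish attack lets the initial transient through before the
    gain reduction engages, emphasising the "punch" of a sound --
    dynamics processing reshaping timbre, not just level.
    """
    atk = math.exp(-1.0 / (rate * attack_ms / 1000.0))
    rel = math.exp(-1.0 / (rate * release_ms / 1000.0))
    env = 0.0
    out = []
    for x in samples:
        level = abs(x)
        # One-pole envelope follower with separate attack/release.
        coeff = atk if level > env else rel
        env = coeff * env + (1.0 - coeff) * level
        level_db = 20.0 * math.log10(max(env, 1e-9))
        over = level_db - threshold_db
        gain_db = -over * (1.0 - 1.0 / ratio) if over > 0.0 else 0.0
        out.append(x * 10.0 ** (gain_db / 20.0))
    return out
```

Run on a sustained loud signal, the first samples pass untouched while the tail is pulled down, reshaping the attack envelope just as Case described.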
After a much anticipated lunch break, Barney Pratt from Supermassive Games introduced Until Dawn to those of us who hadn't played it (with warnings of gore and language). He talked about the relationship between the composer, Jason Graves, and the studio. Graves was brought in as an outsourced composer and asked to bring his own style to the game, rather than being given a predetermined genre to write to. With the use of middleware to enable reactive music systems, they were able to turn approximately 52 minutes of music into about 8 hours of bespoke music.
He mentioned the importance of not letting players notice patterns in the audio that could help them gauge game progression. Supermassive would follow certain musical builds with a stinger until the player learned to expect one, then present the same build-up of tension without a stinger; once that association was broken, the learned pattern no longer gave the player any way to predict gameplay from the music. Until Dawn is a very filmic experience, and because of that Pratt mentioned their difficulty in creating well-positioned audio in the scene across jump cuts (a traditional film technique that moves the camera instantly between positions within the environment). They found the best way of overcoming this problem to be a blend of central audio with the moving audio environment from each jump cut (approximately a 50:50 mix).
After exploring the new possibilities of spatial audio in virtual reality, an evening event was held at the club Tiger Tiger, sponsored by Dolby. This was a bit different from last year's event, which was held at Dolby's headquarters in London, where we were lucky enough to experience the Dolby Atmos theatre room with many upcoming film and game trailers, including the new World of Warcraft, as well as the "living room" Dolby Atmos setup. Compared to that, this year's Wednesday evening entertainment was a little bit of a letdown; I remember how good the visit to Dolby HQ had been, plus the added bonus of being able to talk to many members of Dolby staff. The music this year was a bit loud (I know, *grumble grumble*), but as a social event it was a bit difficult to be social when you struggle to hear the person next to you. Other than that, the first day was a super opener to the Audio for Games conference.
Thursday opened with a really interesting talk by Jakob Schmid, the audio programmer at Playdead. Since the success of Limbo, they have gone on to create a new title called "INSIDE", with a currently unknown release date but expected in 2016, and it was very interesting to see how they could build on that recent success. Schmid focused a lot on their custom audio implementation: although they were using Wwise as their middleware to integrate easily with Unity, they had created their own modded version of Wwise to achieve what they wanted for this game. The most fascinating part was Schmid's discussion of the boy's breathing patterns, which had to match his current state (relaxed, frightened, running etc.). If the player started running, the boy's breath wouldn't be in sync with his footsteps, so they solved the problem in a way comparable to how DJs line up records: they under- or overcompensated by adjusting the pitch until the breathing and footsteps were in phase.
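Playdead didn't show their actual code, but the idea can be sketched like this: treat the breathing loop and the footstep cycle as phases in [0, 1) and compute a small playback-rate multiplier that nudges the breath into phase over a chosen time, rather than jump-cutting. The function name and parameters here are my own illustration, not Playdead's API:

```python
def rate_to_align(breath_phase, step_phase, breath_hz, seconds):
    """Playback-rate multiplier that brings a breathing loop into
    phase with the footsteps over `seconds` (hypothetical names).

    Phases are in [0, 1). Like a DJ nudging a turntable's pitch,
    we speed the loop up or slow it down slightly instead of
    jump-cutting, so the correction is barely audible.
    """
    # Signed phase error in (-0.5, 0.5]: shortest route to sync.
    error = (step_phase - breath_phase + 0.5) % 1.0 - 0.5
    # Extra cycles per second needed to cancel the error in time,
    # expressed relative to the loop's nominal rate.
    return 1.0 + (error / seconds) / breath_hz
```

For example, a 0.5 Hz breathing loop a quarter cycle behind the footsteps would be played 25% faster for two seconds, then returned to normal speed once in phase.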
After a short break we were introduced to a piece of kit called "Foley Designer", currently in its prototype stage. Christian Heinrichs showed us how to digitally create sound effects through gesture and motion with his custom-built hardware/software pairing. He describes the prototype as allowing the user to "easily control and employ expressive, generative audio models". In games, where object movements and their sound effects can be indeterminate, a controller like this allows the synthesis of potentially endless and flexible sound design. As an example, they synthesised a creaky door by gesturing with the custom controller in a similar way to opening and closing a real door, creating a sound effect as long or short as necessary. With traditional Foley it can be difficult to attain the exact same sound if you need to re-record, whereas in Foley Designer you only have to bring up your saved settings and re-record as necessary. An invaluable tool.
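Heinrichs didn't detail Foley Designer's internals, but a toy stick-slip sketch (entirely hypothetical, my own) shows the flavour of gesture-driven sound models: tension builds while the gesture moves, and each time static friction "breaks" the model emits a creak event whose pitch follows the current gesture speed:

```python
def creak_events(gesture_speeds, dt=0.01, stiffness=50.0):
    """Toy stick-slip creak model (hypothetical, not the real
    Foley Designer): map sampled gesture speed (e.g. door-handle
    angular velocity) to creak "slip" events.

    Tension builds while the gesture moves; when it exceeds a
    threshold, static friction "breaks" and one creak impulse is
    emitted, its pitch rising with the current speed.
    """
    events = []
    tension = 0.0
    t = 0.0
    for speed in gesture_speeds:
        tension += abs(speed) * stiffness * dt
        if tension >= 1.0:                       # friction breaks
            events.append((t, 200.0 + 400.0 * abs(speed)))  # (s, Hz)
            tension = 0.0
        t += dt
    return events
```

A slow, steady gesture produces sparse, low creaks; a fast one produces dense, higher ones, and the creak lasts exactly as long as the gesture, which is the appeal over recorded Foley.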
Before lunch we had the incredible Martin Stig Andersen for the keynote presentation. This had been my most anticipated session, and it didn't disappoint! Andersen is the famed composer of Limbo and well known for his creative applications of audio to the medium of video games. The title of his keynote, "Electroacoustic Approaches to Game Audio", should spark anyone's interest, as it's certainly not something discussed regularly in game audio. Andersen talked early on about the fact that many composers in the games industry are a kind of "jack of all trades" with regards to genre and their writing, but he encourages composers and sound designers alike to have their own trademark sound; he compares the alternative to a restaurant that doesn't provide a menu "because they're confident in every dish". Limbo was a perfect match for him, since it fitted his own ideas and tied in with his background: he studied electroacoustic music at City University, and he brings that education to every piece of audio work he does. Andersen says he likes to create ambiguity in the sounds of Limbo by reducing their recognisable characteristics. He utilises a variety of fascinating techniques, such as using a record player for the boy's footsteps, and creating timbral analyses of soundscapes and then having musical instruments play an interpretation of them. He then takes these two parts and interpolates between them, with a truly stunning result. I won't give away all his secrets but will simply leave you with this quote: "If you can extract the music from the environment, it won't be annoying because it's part of the soundscape. You can create a very unique atmosphere".
After lunch in the main room, we had the very cool experience of a live demonstration of the software Dehumaniser. It's an incredibly powerful tool with amazing out-of-the-box presets for any enemy or fantastical beast your game might need, letting anyone with a voice create believable creature effects. The future for Dehumaniser 2 sees a more modular system and the possibility of integration with game engines as a plug-in.
Joe Thwaites from Sony then wrapped the day up with a fascinating look at PlayStation's steps into the competitive space of virtual reality. He talked about the important role of music in virtual reality, and across the whole three days many speakers returned to the idea that "real" audio shouldn't be the goal in video games: they are not simulations, they are emotive experiences.
The final day at Audio for Games started with a much needed coffee and a panel on "Environmental Audio Effects in Virtual and Augmented Reality", chaired by Jean-Marc Jot. The panel allowed each speaker a short presentation followed by a Q&A session towards the end. Scott Selfon returned and made the point that creating "realistic" audio can be boring and distracting to a player, as well as lacking context within a VR/AR environment. He also mentioned that pre-existing audio assets may not always map well to a VR/AR experience, especially with the emergence of ambisonic recordings. Simon Gumbleton added that the goal should be to create an experience, not a simulation, and that often the best approach is a "bed of sound combined with positional elements". Lakulish Antani encouraged us to consider a physics-based approach to environmental game audio, incorporating elements such as reflection, scattering and diffraction, and weighed up the pros and cons of wave simulation versus ray tracing techniques. It was a really useful panel, and the overarching theme seemed to be that it's not necessary, or even desirable, to create an accurate representation of real-world audio; as Simon Ashby put it, "Whoever is responsible for mixing this is doing a terrible job".
I spent some time before lunch in the Council Room upstairs, where throughout the conference Fabric demos had been available via sign-up sheets at reception, alongside various paper presentations. I opted to sit in on the third paper session, "Binaural Sound for VR". The first paper was "Lateral Listener Movement On The Horizontal Plane: Sensing Motion Through Binaural Simulation", presented by Matthew Boerum. It was fascinating to see how audio simulations featuring a crossfade between two sound source locations gave even more of a sense of motion than an actual recording of the same motion. This has an incredibly useful application to moving game objects and how a player experiences their movement, especially when they aren't visible; notably, the study found very similar results with and without visuals.
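The crossfading idea itself is simple to sketch. Assuming an equal-power law between two fixed, pre-rendered source positions (the paper's exact method may differ, and these names are illustrative), the two gains keep total power constant as the virtual source moves laterally:

```python
import math

def crossfade_gains(position):
    """Equal-power gains for simulating lateral motion by
    crossfading between two fixed, pre-rendered source positions
    (e.g. two binaural renders). `position` runs from 0.0 (fully
    at source A) to 1.0 (fully at source B).
    """
    theta = position * math.pi / 2.0
    return math.cos(theta), math.sin(theta)   # ga**2 + gb**2 == 1

def render_sample(sample_a, sample_b, position):
    """Mix one sample from each render at the given position."""
    ga, gb = crossfade_gains(position)
    return ga * sample_a + gb * sample_b
```

Midway through the fade each source contributes about 0.707 of its amplitude, so perceived loudness stays roughly constant while the source appears to "move" from A to B.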
Chanel Summers gave the final presentation of the conference in the main room, showcasing the Leviathan Project, an AR experience that changes the way stories are told and how games are experienced. She used this opportunity to demonstrate interacting with small creatures called the Huxleys, almost like small virtual pet jellyfish. Via any camera, the app was able to add a layer of either the Leviathan whale or the Huxleys onto a live feed and have them interact with the real environment. She described the process of coming up with their voices, starting with her own vocalisations of how she imagined them to sound, but eventually basing the smaller Huxley on the sounds her own dog makes when it particularly wants attention, e.g. higher-register whines and barks. It was a lovely end to the three days and inspiring to see the new situations we are going to have to apply audio to as technology advances.
The Sound Architect