What next-gen consoles really mean for audio
Sumo Digital discusses the audio tools that could be a game changer for PlayStation 5 and Xbox Series X|S
With the Xbox Series X|S and PlayStation 5 launching, years of work from Microsoft and Sony are finally in the hands of players.
And while technological advances such as ray tracing and 4K at 120 frames per second have been widely discussed, other areas have quietly seen breakthroughs too. Sound is one of them, with 3D audio making its way to both Microsoft's and Sony's consoles.
"When we think of next-gen audio, a lot of it is [about] audio tech that has existed in smaller forms before, which is being introduced to a wide audience," says Daan Hendriks, lead sound designer at The Chinese Room. "From that angle it's just very interesting that it's basically becoming mainstream. I'm specifically thinking about something like 3D audio, where you add verticality to surround sound, which is something that was more or less introduced with VR. But now it's coming to the bigger console experience. It really allows us as sound designers to create more immersive audio experiences, and more immersive games."
"The most exciting thing about next gen is audio is in the conversation" Jey Kazi, Sumo Digital
The PlayStation 5 comes loaded with its own dedicated sound technology, Tempest, which enables 3D audio through headphones. This has been a focal point in Sony's marketing campaign, including in its Play Has No Limits ad from August. The platform holder is supporting this new feature by launching a dedicated headset alongside the PS5, the Pulse 3D wireless headset, but 3D audio can be enjoyed on any pair of 3.5mm headphones or earbuds. Sony has also shared its ambition to eventually deliver 3D audio through regular TV speakers.
This component of the PS5 was widely advertised, but the Xbox Series X and S actually offer a very similar feature. Microsoft's two consoles come loaded with 3D spatial audio which, much like on the PS5, can be enjoyed with headphones. The Series X|S consoles support object-based audio codecs DTS:X and Dolby Atmos, and also have a dedicated audio chip.
And while 3D audio might not sound like a massive selling point to some people, it's a big deal for sound engineers, designers and programmers across the industry.
"The most exciting thing about next gen is audio is in the conversation," says Jey Kazi, senior sound designer at Sumo Digital. "We've not been in the conversation really ever, so for me [that's] the most exciting thing. The new hardware is great, but just the fact that big companies are using [3D audio] as a selling point is amazing. It means that hopefully people can appreciate good audio."
While it's undoubtedly good news that this new generation of hardware is putting audio on the map, both in the industry and among players, there's a catch.
"The crux of that is that we're not supposed to be noticed. It's a bit of a double-edged sword that we have to deal with as audio people in the games industry now. [It's like] when you watch a film. Audio is 50% of the experience but it's not really supposed to be noticed. Usually when it's noticed, it's either really bad, or really loud. Our job mainly is to immerse the player and give them information. It's about storytelling."
3D audio adds a layer to what sound designers do, and a bit of pressure too, as it makes their work more noticeable. That can be a challenge when your job is to be as subtle as possible in most cases.
"People won't start raving about sound design the same way they will about really beautiful graphics that are just instantly recognisable"Daan Hendriks, The Chinese Room
"It's not like as sound designers you sit there hoping to get noticed," says Hendriks. "It's more that, for us, audio is a tool to steer the way players should feel during gameplay, and that's not necessarily supposed to be in your face. People won't start raving about sound design in the same way that they will about really beautiful graphics that are just instantly recognisable. With audio it's more supposed to be a sort of psychological, in the background experience."
With the advent of 3D audio partly changing the nature of sound designers' work, there are new tools that they need to master -- a challenge that everyone has to face, whether a newcomer or a veteran.
"An interesting thing with the advent of 3D audio is that it makes you think about how to design sounds, because it's an effect," says Joe White, senior programmer at Sumo Nottingham. "And it certainly changes things. That's new skills that people need to learn, but it's new for the existing industry to learn how to manage it as well."
While the fundamental tools haven't really changed, the way sound is experienced has. This means some aspects of the tools are getting more weight than before.
"For instance, one very popular bit of middleware that is used in many projects across the industry for audio development is Wwise, and the next version that they are working on will have an even heavier focus on 3D audio than it had so far," Hendriks says. "So that of course is in line with the general trend that we're seeing. And there is definitely a lot to learn there for sound designers, whether you are experienced or just entering the industry.
"It can be kind of a new way of thinking about how to develop your sounds. If you've never had to design audio with HRTF filtering in mind, it's all of a sudden quite a different workflow, where you have to really bear in mind that the sounds that you're designing are not going to be represented with the same fidelity as they normally would. You really have to run them through those filters to see whether your sounds are still sounding the way that you want.
"Certain aspects might be highlighted in a way that you didn't intend. White noise elements to a sound, when they are run through HRTF filtering, might sound a bit more watery and not as impactful. So whether you are experienced or not, that is just a new paradigm you have to learn to work with."
The Series X and the PS5 should deliver a similar experience for the end user, despite using different tech. And the fact that they both have a dedicated hardware component for audio is promising for developers.
"What's interesting with the next gen stuff is they've improved on having dedicated hardware for doing specific audio calculations," says White. "So Sony's got the Tempest system that includes the audio coprocessor and it optimises the workflows that you want to do for solving problems like 3D audio, or doing other kinds of digital signal processing (DSP) that you might tend to do in a game."
Beyond 3D audio, there are bottlenecks sound engineers have been facing that the new hardware should ease considerably. Ray tracing benefits audio too, allowing developers to better simulate how sound travels through the environment.
"I would also point out things like convolution reverb," Hendriks continues. "We have more space to use that now as we would have had in the past. There's less risk of a performance hit when we use really high quality reverberation with this next gen set of hardware."
"What's interesting with next gen is they've improved on having dedicated hardware for doing specific audio calculations"Joe White, Sumo Nottingham
Kazi adds: "The sheer amount of memory [in the new consoles] is a big thing, because it means that we're not worrying too much about the sheer amount of content that we're putting in, whereas with the previous gen that's always something that you have to think about. That also means that we can have multi-channel sound files, which on the previous gen is again something that you had to really think about, because you can't have too many streams at once.
"Because we have more power, that means that we can run more real-time effects and DSP which does open up the window to more creative possibilities. At the moment we're quite restricted with that. We have reverbs, but we don't really get creative with effects."
And beyond the launch of the PlayStation 5 and Xbox Series X|S, sound designers have exciting things to look forward to, such as procedural audio, even if it might be a few years away. Procedural audio means sounds are generated while the game is running rather than played back from a pre-recorded sound file.
"It's great to see that something like 3D audio is now becoming readily available to us," Hendriks says. "[But] there's things like procedural audio that we're waiting for, as it becomes more and more of a mainstream thing. I'm really looking forward to procedural audio becoming more of a viable option than it has been so far. That requires hardware that is capable of doing that, and a lot of R&D time, investment and forward thinking, which I just hope to see more of."
White continues: "[Procedural audio] is a similar approach to the graphical technology where you run shaders, which are little programs that will determine how things might visually look or how the geometry is arranged. That would run just while the game's running and change depending on the parameters of the game. Thinking about how the actual source sound is generated, not just how it's positioned in the space but how you interact with it and how it changes its sound depending on what the interaction is -- that's a whole field that's kind of unexplored and I think there's a lot of scope for it. And the new hardware hopefully allows us to start seeing it as a realistic option."
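As a rough illustration of the idea, the Python sketch below generates an impact sound from a gameplay value instead of loading a file. Everything in it is an assumption made for the example: the `synth_impact` name, the normalised `velocity` parameter, and the mapping from velocity to pitch and loudness are invented, and real procedural audio models are far richer than a single decaying sine wave.

```python
import math


def synth_impact(velocity, sample_rate=48000, duration=0.25):
    """Generate a decaying sine 'impact' at runtime.

    velocity is a hypothetical gameplay parameter, clamped to [0, 1].
    Harder hits come out both louder and higher pitched.
    """
    velocity = max(0.0, min(1.0, velocity))
    frequency = 120.0 + 480.0 * velocity   # invented mapping: 120 Hz to 600 Hz
    amplitude = velocity                   # invented mapping: linear loudness
    decay = 12.0                           # envelope decay rate per second
    count = int(sample_rate * duration)
    return [
        amplitude
        * math.exp(-decay * n / sample_rate)
        * math.sin(2.0 * math.pi * frequency * n / sample_rate)
        for n in range(count)
    ]


soft_hit = synth_impact(0.2)
hard_hit = synth_impact(0.9)
```

The same function yields a different sound for every value the game passes in, which is the property White compares to shaders: the source sound itself responds to the interaction rather than only its position in space.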
Beyond pure audio design, music is next in line in terms of innovations to look forward to on the new consoles. 3D music would add another layer of challenge for designers though, as game soundtracks are supposed to be non-diegetic, meaning the music doesn't originate from the world of the game.
"It's supposed to be up behind what's going on, just like in a film, so it's a big challenge for just general creative direction," Kazi says. "Is the music coming from the world or is it still non-diegetic? I think that that is a big creative challenge that would be exciting to see what people do with that."
White has a background in interactive music applications, and he's excited by the possibilities offered by 3D audio in terms of music.
"I was making interactive music apps and looking at how the games industry was treating audio as an interactive experience," he says. "And it's so far ahead. I think definitely music as a medium has a lot of catch up to do. I think that there are some people exploring this area but thinking about purely musical experience in an interactive medium I think is really cool, and using 3D audio techniques and game engines to deliver it is interesting.
"You also have to think about how you structure the composition, because it's always a tricky problem in game audio where you want the music to adapt to the context of what the player is in. If it's written as a linear segment that you just somehow have to push into an interactive medium then it can be quite tricky, and you see the seams of how it works. But if you actually think about how to compose within an interactive medium, I think there's a lot more possibilities."
The reactive audio in Gabe Cuzzillo's Ape Out is a good example of what White is hoping to see more of, and a taste of what's to come in the world of game audio. Hendriks provides another example of a possible music innovation, one that was explored during GameSoundCon this October.
"Wwise talked about their next version of [the software] and they gave an example of an unspecified game where music was used to play certain instruments in the world, so that it would draw your attention. The example was they really wanted the player to explore a certain cave so they just placed the brass section of the orchestra at the opening of that cave, and the rest of the orchestra would still play normally as 2D, which I thought was just an interesting example of using [3D audio].
"You don't need a 3D audio playback system to do that, but of course when you hear that over headphones in a fully spatialised world, that might be a very interesting experience. It might also be very weird, but I'm just looking forward to new approaches to using this tech."
These new approaches could include moving away from using generalised profiles for 3D audio, pushing the experience even further.
"Right now, it essentially uses an algorithm that generalises the way that your ears might be shaped to give you that impression of 3D audio over headphones," Hendriks explains. "But it will be much more effective if in the future you can actually get your own ears scanned and then have your own personal profile.
"That's maybe a few years from now, maybe a decade from now, but you could have 3D audio that's such high fidelity that it's almost indistinguishable from real world sound. I mean, that's a little bit out there, but that sort of path is possible and so in a way we're just at the beginning of this journey."