Here’s a post I wrote some time ago on the Unity forums, detailing my ideas for Singularity. In it I make some generalisations and definitely leave out a lot of details and subtleties, but it makes a nice introduction to what I’m doing.
The State of Game Audio
The traditional sample-based approach to game audio is old and dated.
Over the course of the last two decades, game graphics have evolved from bitmap sprites to near photo-realistic imagery running at a solid 60 frames per second. We have shaders, massively parallel calculations running on dedicated hardware, and much more. With today’s and tomorrow’s hardware you can literally trace a ray of light as it bounces from surface to surface (and even through them!) towards the camera, creating crystal clear pictures with ever-increasing fidelity.
Some of these developments are slowly starting to transfer to game audio, but not nearly enough! Games across the entire spectrum, from AAA to indie, still resort to ancient sample-based approaches for audio. Middleware packages such as Wwise or FMOD offer real-time effects processing, which is a step forward, but they don’t let you define your own synthesis model and generate sound from scratch, on the fly. Furthermore, these packages seem to be aimed mostly at AAA first-person-shooter titles, making it difficult to do something radically different with them. And lastly, only the latter of the two is available to you if you are a small-time developer.
This inhibits the development of game audio as a more integral part of game design. The result is that audio in most games is still mostly, and sometimes even literally, an afterthought. In my opinion game audio is at least 10 years behind game graphics, both in technological capabilities and in how those capabilities are used.
Audio Design Process
The huge gap in technology means that audio development is a parallel process.
A typical game designer writes a design document with little attention to how the game is supposed to sound. With tools such as Unity, a studio can start prototyping a game within weeks, allowing very agile development methodologies. Sound designers and composers typically get none of these benefits. They are called in late in the process, get their assignment, produce the end result, and that’s it. Game developers embraced agile development a long time ago, while audio designers are still stuck with the waterfall approach. This is not by choice, but by necessity.
Game audio is still mostly linear content, applied in a non-linear context. Situations in games change constantly, and without warning! The audio and music in games should be able to adapt to this instantly.
All this means that audio is never central to a game design, while audio and music are actually a great area for innovation! There is a huge amount of creative potential here, completely untapped.
Laptop Performance & Live Coding
Stepping outside of the world of game development for a moment, let’s look at what musicians can do with technology.
The world of electronic music has plenty of tools that enable rapid, on-the-fly development of audio. Think of Ableton Live and Reason. There are also plenty of packages that enable a programmer’s approach to audio, such as Pure Data, Max/MSP, SuperCollider and ChucK. Performers can even take these tools on stage, start from a blank slate, and entertain crowds within minutes! That’s how powerful they are.
So I’m left to think: Why have we not integrated these tools into our game development environments?
Check out some of these videos for an idea of what this technology can do:
(Long videos; scroll through them if you’re impatient.)
Singularity – A flexible, real-time, general purpose audio engine
So I’ve been looking at my options for integrating that kind of technology directly into Unity.
My first thought was Pure Data. PD, with its visual programming paradigm, is very easy to get into, and the software is quite mature. However, PD does not support object-oriented programming, which means its architecture does not map well onto a game engine.
Next I looked at ChucK. ChucK has great ideas about managing and playing with time, and its language offers very simple yet very powerful semantics. ChucK’s implementation, however, is still very immature, resulting in very unstable performance.
SuperCollider appears to be the most suitable for integration with Unity. Its language is well-defined, its implementation seems robust and fast, and it is very feature-rich.
Regardless of the eventual implementation, the idea is that you get full control over the in-game audio.
SuperCollider is multiplatform, so builds for Windows, Mac and Linux should all be possible. Support for more exotic platforms such as mobile devices and game consoles will likely require significant modifications to the SuperCollider source code, but should nevertheless be possible. SuperCollider uses a client-server model, meaning that the server could run alongside both a game and the editor, and Unity could provide an integrated graphical front-end. Communication with the server happens through Open Sound Control (OSC).
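To give a feel for what that communication looks like on the wire, here’s a minimal sketch (in Python for brevity; the Unity-side version would live in C#/Mono) that hand-encodes an OSC message per the OSC 1.0 spec. The /s_new command is scsynth’s real “create a synth” message, but the "wind" synthdef and its "freq" control are made up for the example; a real client would of course use a proper OSC library.

```python
import struct

def osc_pad(b: bytes) -> bytes:
    """Null-terminate and pad to a multiple of 4 bytes, per the OSC spec."""
    b += b"\x00"
    return b + b"\x00" * (-len(b) % 4)

def osc_message(address: str, *args) -> bytes:
    """Encode an OSC message: padded address, type tag string, big-endian args."""
    tags = ","
    payload = b""
    for a in args:
        if isinstance(a, int):
            tags += "i"
            payload += struct.pack(">i", a)   # 32-bit big-endian int
        elif isinstance(a, float):
            tags += "f"
            payload += struct.pack(">f", a)   # 32-bit big-endian float
        elif isinstance(a, str):
            tags += "s"
            payload += osc_pad(a.encode())    # padded string
        else:
            raise TypeError(f"unsupported OSC argument type: {type(a)}")
    return osc_pad(address.encode()) + osc_pad(tags.encode()) + payload

# Ask scsynth to instantiate a synth from a (hypothetical) "wind" synthdef:
# /s_new <defname> <node-id> <add-action> <target-group> [<control> <value> ...]
msg = osc_message("/s_new", "wind", 1001, 0, 1, "freq", 440.0)

# A real client would now send this over UDP to scsynth's default port:
# socket.socket(socket.AF_INET, socket.SOCK_DGRAM).sendto(msg, ("127.0.0.1", 57110))
```

The whole protocol is just length-prefixed, 4-byte-aligned binary like this, which is why it is cheap enough to fire from a game loop every frame.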
Several things need to be built:
- An Open Sound Control implementation in .NET/Mono
- Client-side (Unity) representations of server objects and their composition. These should provide easy-to-use access to the SuperCollider server straight from Unity scripts, optimised to introduce as little overhead and latency as possible.
- A fully featured SuperCollider client inside the Unity editor. This includes a code editor, a port of the sclang interpreter, and possibly a GUI for editing synthdefs.
- The SuperCollider server compiled as a dynamic link library for each supported platform, for complete integration into any game.
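To illustrate what I mean by client-side representations of server objects, here’s a rough sketch of a Synth handle (again in Python rather than Unity’s C#, to keep it short). The message names are scsynth’s actual commands (/s_new, /n_set, /n_free); the `transport` callable is a stand-in for whatever OSC sender the .NET/Mono layer provides.

```python
class Synth:
    """Client-side handle for a synth node on the SuperCollider server.

    'transport' is anything that can deliver an OSC message (address, *args)
    to scsynth. The message shapes follow the scsynth command reference.
    """
    _next_id = 1000  # simple client-side node ID allocation

    def __init__(self, transport, defname: str, **controls):
        self.transport = transport
        Synth._next_id += 1
        self.id = Synth._next_id
        args = [defname, self.id, 0, 1]  # add-action 0: head of default group 1
        for name, value in controls.items():
            args += [name, float(value)]
        transport("/s_new", *args)

    def set(self, **controls):
        """Update control values on the running synth."""
        args = [self.id]
        for name, value in controls.items():
            args += [name, float(value)]
        self.transport("/n_set", *args)

    def free(self):
        """Remove the synth node from the server."""
        self.transport("/n_free", self.id)

# With a logging transport you can see exactly what a script would send:
sent = []
log = lambda addr, *args: sent.append((addr, args))
s = Synth(log, "wind", freq=300)  # "wind" and "freq" are illustrative names
s.set(freq=500.0)
s.free()
```

A Unity script would then treat a sound the same way it treats any other component: create it, poke its parameters every frame, destroy it.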
For Unity Basic users, this package would enable free experimentation with SuperCollider running externally. For Pro users, it would mean seamless integration of SuperCollider into your builds, with the end-user none the wiser.
If you’re still with me, I’m very interested in hearing what you think! Already have an idea for a game using this technology? See some pitfalls in the implementation? Think all of this is nonsense? Do tell!
Modeling wind and airflow: For my current game, Volo, I need to model the complex noise you hear when your head moves through air at very high speeds (say, like sticking your head out of the window while driving at 80 mph). The thing is, the resulting sound is entirely dependent on your head’s orientation relative to the airflow! This is not something you can do effectively with samples, as changing your orientation by even a couple of degrees dramatically changes the character of the sound.
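To sketch what procedural wind could look like (an invented toy model for illustration, not Volo’s actual synthesis): filtered noise whose brightness and loudness follow the angle between your head and the airflow. Head-on, you get bright hiss; turned away, the turbulence darkens and quietens.

```python
import math
import random

SR = 44100  # sample rate, Hz

def wind_block(angle_deg: float, speed: float, n: int, seed: int = 0) -> list[float]:
    """Generate n samples of wind-like noise.

    angle_deg: angle between the head's facing direction and the airflow
               (0 = head-on, 180 = facing away).
    speed:     airspeed in 0..1, scaling loudness and brightness.
    The mapping from angle to filter cutoff is made up for this sketch,
    not measured aerodynamics.
    """
    rng = random.Random(seed)
    facing = 0.5 * (1.0 + math.cos(math.radians(angle_deg)))  # 1 head-on, 0 tail-on
    cutoff = 200.0 + 8000.0 * facing * speed                  # Hz: bright when head-on
    a = math.exp(-2.0 * math.pi * cutoff / SR)                # one-pole lowpass coefficient
    gain = speed * (0.4 + 0.6 * facing)
    out, y = [], 0.0
    for _ in range(n):
        white = rng.uniform(-1.0, 1.0)
        y = (1.0 - a) * white + a * y   # smooth the white noise
        out.append(gain * y)
    return out
```

In a real engine this would run per audio block, with the angle and airspeed fed in from the game’s head transform every frame — exactly the kind of continuous coupling between game state and sound that samples can’t give you.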
Binaural mixing: Using filters modelled on the human head and ears (HRTFs, head-related transfer functions), you can process sounds so that the signal contains spatial cues the brain understands. The result is highly realistic sound localisation that has to be heard to be believed: Virtual Barbershop Demo – Use Headphones
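One of the dominant cues an HRTF encodes is the interaural time difference (ITD): sound from the side reaches the far ear slightly later than the near ear. Woodworth’s classic spherical-head approximation makes this concrete; the head radius and speed of sound below are the usual textbook values.

```python
import math

HEAD_RADIUS = 0.0875     # metres, textbook average adult head
SPEED_OF_SOUND = 343.0   # m/s at room temperature

def itd_seconds(azimuth_deg: float) -> float:
    """Woodworth's spherical-head estimate of the interaural time difference.

    azimuth_deg: source angle off the median plane, 0 = straight ahead,
    90 = directly to one side. Returns the near-ear/far-ear arrival-time
    difference in seconds: (a / c) * (theta + sin(theta)).
    """
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))
```

At 90° this comes out around 0.65 ms; delay one channel by that amount (and attenuate it a little) and you already get a crude but convincing sense of direction, all computed live from the game’s geometry. A full HRTF adds the frequency-dependent filtering on top.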
How about a music game where the music doesn’t remain static but actually changes as the player plays? I’ve recently done a project in which a player could use a Guitar Hero controller to actually play guitar, and produce feedback-fuelled Jimi Hendrix solos.