OpenAL Soft's renderer has advanced quite a bit since its start with panned stereo output. Among these advancements is support for surround sound output, using psychoacoustic modeling and more accurate plane wave reconstruction. The concepts in use may not be immediately obvious to people just getting into 3D audio, or people who only have more indirect experience through the use of 3D audio APIs, so this document aims to introduce the ideas and purpose of Ambisonics as used by OpenAL Soft.

What Is It?
===========

Originally developed in the 1970s by Michael Gerzon and a team of others, Ambisonics was created as a means of recording and playing back 3D sound. Taking advantage of the way sound waves propagate, it is possible to record a fully 3D soundfield using as few as 4 channels (or even just 3, if you don't mind dropping down to 2 dimensions, as many surround sound systems do). This representation is called B-Format. It was designed to handle audio independent of any specific speaker layout, so with a proper decoder the same recording can be played back on a variety of speaker setups, from quadraphonic and hexagonal to cubic and other periphonic (with height) layouts.

Although it was developed decades ago, various factors held ambisonics back from really taking hold in the consumer market. However, given the solid theories backing it, as well as the potential and practical benefits on offer, it continued to be a topic of research over the years, with improvements being made over the original design. One of these improvements is the use of Spherical Harmonics to increase the number of channels for greater spatial definition. Where the original 4-channel design is termed "First-Order Ambisonics", or FOA, the increased channel count through the use of Spherical Harmonics is termed "Higher-Order Ambisonics", or HOA. The details of higher-order ambisonics are beyond the scope of this document, but know that the added channels are still independent of any speaker layout, and aim to further improve the spatial detail for playback.

Today, the processing power available on even low-end computers means real-time Ambisonics processing is possible. Not only can decoders be implemented in software, but so can encoders, synthesizing a soundfield using multiple panned sources, thus taking advantage of what ambisonics offers in a virtual audio environment.
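
As a rough illustration of a software encoder, the sketch below pans mono samples into a first-order soundfield and sums several sources into one mix. It assumes the traditional B-Format convention (X forward, Y left, Z up, azimuth counter-clockwise from the front, W scaled by 1/sqrt(2)); the function names are hypothetical, and OpenAL Soft's internal channel ordering and scaling may differ.

```python
import math

def encode_foa(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample into first-order B-Format (W, X, Y, Z).

    Assumed convention: X forward, Y left, Z up, azimuth measured
    counter-clockwise from the front, W attenuated by 1/sqrt(2).
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)               # omnidirectional pressure
    x = sample * math.cos(az) * math.cos(el)  # front-back component
    y = sample * math.sin(az) * math.cos(el)  # left-right component
    z = sample * math.sin(el)                 # up-down component
    return (w, x, y, z)

def mix_sources(sources):
    """Sum (sample, azimuth_deg, elevation_deg) sources into one soundfield."""
    mix = [0.0, 0.0, 0.0, 0.0]
    for sample, azimuth_deg, elevation_deg in sources:
        encoded = encode_foa(sample, azimuth_deg, elevation_deg)
        for i, value in enumerate(encoded):
            mix[i] += value
    return tuple(mix)

# Two equal sources, hard left and hard right: the directional parts
# cancel and only the omnidirectional W channel remains.
field = mix_sources([(1.0, 90.0, 0.0), (1.0, -90.0, 0.0)])
```

The channel count stays at 4 no matter how many sources are mixed, which is what keeps the soundfield independent of both the source count and the speaker layout.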

How Does It Help?
=================

Positional sound has come a long way from pan-pot stereo (aka pair-wise). Although useful at the time, the issues became readily apparent when trying to extend it for surround sound. Pan-pot doesn't work as well for depth (front-back) or vertical panning, it has a rather small "sweet spot" (the area the head needs to be in to perceive the sound in its intended direction), and it misses key distance-related details of sound waves.

Ambisonics takes a different approach. It uses all available speakers to help localize a sound, and it also takes into account how the brain localizes low frequency sounds compared to high frequency ones -- a so-called psychoacoustic model. It may seem counter-intuitive (if a sound is coming from the front-left, surely just play it on the front-left speaker?), but to properly model a sound coming from where a speaker doesn't exist, more needs to be done to construct a proper sound wave that's perceived to come from the intended direction. Doing this creates a larger sweet spot, allowing the perceived sound direction to remain correct over a larger area around the center of the speakers.
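
To make the "all available speakers" point concrete, here is a minimal sketch of a basic horizontal decode for a regular ring of speakers. It assumes the traditional B-Format convention (W scaled by 1/sqrt(2), X forward, Y left) and a textbook single-band decode; the function names are hypothetical, and this is not OpenAL Soft's actual decoder, which also applies the frequency-dependent processing described above.

```python
import math

def encode_2d(sample, azimuth_deg):
    """Encode a mono sample into horizontal B-Format (W, X, Y)."""
    az = math.radians(azimuth_deg)
    return (sample / math.sqrt(2.0),
            sample * math.cos(az),
            sample * math.sin(az))

def decode_ring(w, x, y, speaker_azimuths_deg):
    """Basic decode for speakers spaced evenly around the listener.

    Every speaker receives a signal; speakers facing away from the
    source may receive a small out-of-phase contribution.
    """
    count = len(speaker_azimuths_deg)
    feeds = []
    for speaker_deg in speaker_azimuths_deg:
        a = math.radians(speaker_deg)
        directional = x * math.cos(a) + y * math.sin(a)
        feeds.append((math.sqrt(2.0) * w + 2.0 * directional) / count)
    return feeds

# Square layout: front-left, back-left, back-right, front-right.
SQUARE = [45.0, 135.0, -135.0, -45.0]

# A source straight ahead, where the square has no speaker.
feeds = decode_ring(*encode_2d(1.0, 0.0), SQUARE)
```

With this layout the two front speakers get equal feeds of about 0.60 and the two rear speakers about -0.10, so the feeds still sum to the original sample while the rear pair helps steer the reconstructed wave.
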

In addition, Ambisonics can encode the near-field effect of sounds, effectively capturing the sound distance. The near-field effect is a subtle low-frequency boost that results from wave-front curvature. Properly compensating for the boost that occurs at the output speakers (as well as emulating it in a synthesized soundfield) can create an improved sense of distance for sounds that move near or far.
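
The size of that boost can be estimated with a textbook model: for a point source at distance r, the first-order directional components are scaled relative to the pressure component by 1 + c/(j*w*r), where c is the speed of sound and w the angular frequency. The sketch below evaluates only the magnitude of that factor. The constant and function names are assumptions for illustration, and this is not taken from OpenAL Soft's filter implementation.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature

def near_field_gain(freq_hz, distance_m):
    """Magnitude of the first-order near-field factor |1 + c/(j*w*r)|."""
    ratio = SPEED_OF_SOUND / (2.0 * math.pi * freq_hz * distance_m)
    return math.sqrt(1.0 + ratio * ratio)

def compensated_gain(freq_hz, source_distance_m, speaker_distance_m):
    """Boost for a virtual source after cancelling the boost that the
    real speakers already add at their own distance."""
    return (near_field_gain(freq_hz, source_distance_m)
            / near_field_gain(freq_hz, speaker_distance_m))

# The boost grows as frequency drops or as the source gets closer.
low_close = near_field_gain(50.0, 0.5)
low_far = near_field_gain(50.0, 5.0)
high_close = near_field_gain(5000.0, 0.5)
```

When the virtual source sits at the same distance as the speakers, `compensated_gain` returns 1 at every frequency, which is why speaker distance information is needed to get the effect right.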

How Is It Used?
===============

As a 3D audio API, OpenAL is tasked with playing 3D sound as best it can with the speaker setup the user has. Since the OpenAL API does not explicitly handle the output channel configuration, it has a lot of leeway in how to deal with the audio before it's played back for the user to hear. Consequently, OpenAL Soft (or any other OpenAL implementation that wishes to) can render using Ambisonics and decode the ambisonic mix for greater accuracy than simple pan-pot could provide.

When given an appropriate decoder configuration for the channel layout, the ambisonic mix can be decoded utilizing the benefits available to ambisonic processing, including frequency-dependent processing and near-field effects. Without a decoder configuration, the ambisonic mix can still be decoded for good stereo or surround sound output, although without near-field effects as there's no speaker distance information.

In addition to surround sound output, Ambisonics also has benefits with stereo output. 2-channel UHJ is a stereo-compatible format that encodes some surround sound information using a wide-band 90-degree phase shift filter. This is generated by taking the ambisonic mix and deriving a front-stereo mix with the rear sounds filtered in with it. Although the result is not as good as 3-channel (2D) B-Format, it has the distinct advantage of only using 2 channels and being compatible with stereo output. This means it will sound just fine when played as-is through a normal stereo device, or it may optionally be fed to a properly configured surround sound receiver which can extract the encoded information and restore some of the original surround sound signal.
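
The sketch below shows the idea using the commonly published 2-channel UHJ encoding equations. To stay short it works on complex amplitudes of a single sinusoid, where the 90-degree phase shift becomes multiplication by `1j`; a real encoder needs a wide-band phase-shift filter instead. The function names are hypothetical, the sign of the shift depends on the time convention, and the coefficients are the textbook ones, not values read from OpenAL Soft's source.

```python
import cmath
import math

def encode_2d(amplitude, azimuth_deg):
    """Horizontal B-Format (W, X, Y) for a source; W scaled by 1/sqrt(2)."""
    az = math.radians(azimuth_deg)
    return (amplitude / math.sqrt(2.0),
            amplitude * math.cos(az),
            amplitude * math.sin(az))

def uhj_encode(w, x, y):
    """Encode one sinusoid's B-Format amplitudes to 2-channel UHJ.

    Multiplying by 1j stands in for the wide-band 90-degree shift.
    """
    s = 0.9396926 * w + 0.1855740 * x                          # sum signal
    d = 1j * (-0.3420201 * w + 0.5098604 * x) + 0.6554516 * y  # difference
    return ((s + d) / 2.0, (s - d) / 2.0)

front_l, front_r = uhj_encode(*encode_2d(1.0, 0.0))
rear_l, rear_r = uhj_encode(*encode_2d(1.0, 180.0))
left_l, left_r = uhj_encode(*encode_2d(1.0, 90.0))

# Front and rear sources both come out equally loud in each channel,
# but with opposite inter-channel phase, which a decoder can exploit.
front_phase = cmath.phase(front_l) - cmath.phase(front_r)
rear_phase = cmath.phase(rear_l) - cmath.phase(rear_r)
```

A source on the left ends up much louder in the left channel, so the result still behaves like ordinary stereo when played without a decoder.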

What Are Its Limitations?
=========================

As good as Ambisonics is, it's not a magic bullet that can overcome all problems. One of the bigger issues it has is dealing with irregular speaker setups, such as 5.1 surround sound. The problem mainly lies in the imbalanced speaker positioning -- there are three speakers within the front 60-degree area (meaning only 30-degree gaps in between each of the three speakers), while only two speakers cover the back 140-degree area, leaving 80-degree gaps on the sides. It should be noted that this problem is inherent to the speaker layout itself; there isn't much that can be done to get an optimal surround sound response, with ambisonics or not. It will do the best it can, but there are trade-offs between detail and accuracy.
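
The gap figures above can be checked with a few lines of arithmetic. The sketch assumes the five main speakers sit at 0, +/-30 and +/-110 degrees, which is the placement implied by the numbers in the text; the function name is hypothetical.

```python
def speaker_gaps(azimuths_deg):
    """Return the angular gap from each speaker to the next one around
    the circle, in order of increasing azimuth (0..360 degrees)."""
    ordered = sorted(a % 360.0 for a in azimuths_deg)
    gaps = []
    for i, azimuth in enumerate(ordered):
        following = ordered[(i + 1) % len(ordered)]
        gaps.append((following - azimuth) % 360.0)
    return gaps

# Assumed 5.1 placement: center, front pair, surround pair (no LFE).
SURROUND_5 = [0.0, 30.0, 110.0, -110.0, -30.0]
HEXAGON = [0.0, 60.0, 120.0, 180.0, 240.0, 300.0]

print(speaker_gaps(SURROUND_5))  # [30.0, 80.0, 140.0, 80.0, 30.0]
print(speaker_gaps(HEXAGON))     # [60.0, 60.0, 60.0, 60.0, 60.0, 60.0]
```

The largest 5.1 gap is more than four times the smallest, whereas a regular hexagon has no variation at all, which is the imbalance a decoder has to work around.
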

Another issue lies with HRTF. While it's certainly possible to play an ambisonic mix using HRTF and retain a sense of 3D sound, doing so with a high degree of spatial detail requires a fair amount of resources, in both memory and processing time. And even then, mixing sounds with HRTF directly will still be better for positional accuracy.