|
|
- OpenAL Soft's renderer has advanced quite a bit since its start with panned
- stereo output. Among these advancements is support for surround sound output,
- using psychoacoustic modeling and more accurate plane wave reconstruction. The
- concepts in use may not be immediately obvious to people just getting into 3D
- audio, or people who only have more indirect experience through the use of 3D
- audio APIs, so this document aims to introduce the ideas and purpose of
- Ambisonics as used by OpenAL Soft.
-
-
- What Is It?
- ===========
-
- Originally developed in the 1970s by Michael Gerzon and a team others,
- Ambisonics was created as a means of recording and playing back 3D sound.
- Taking advantage of the way sound waves propogate, it is possible to record a
- fully 3D soundfield using as few as 4 channels (or even just 3, if you don't
- mind dropping down to 2 dimensions like many surround sound systems are). This
- representation is called B-Format. It was designed to handle audio independent
- of any specific speaker layout, so with a proper decoder the same recording can
- be played back on a variety of speaker setups, from quadraphonic and hexagonal
- to cubic and other periphonic (with height) layouts.
-
- Although it was developed decades ago, various factors held ambisonics back
- from really taking hold in the consumer market. However, given the solid
- theories backing it, as well as the potential and practical benefits on offer,
- it continued to be a topic of research over the years, with improvements being
- made over the original design. One of the improvements made is the use of
- Spherical Harmonics to increase the number of channels for greater spatial
- definition. Where the original 4-channel design is termed as "First-Order
- Ambisonics", or FOA, the increased channel count through the use of Spherical
- Harmonics is termed as "Higher-Order Ambisonics", or HOA. The details of higher
- order ambisonics are out of the scope of this document, but know that the added
- channels are still independent of any speaker layout, and aim to further
- improve the spatial detail for playback.
-
- Today, the processing power available on even low-end computers means real-time
- Ambisonics processing is possible. Not only can decoders be implemented in
- software, but so can encoders, synthesizing a soundfield using multiple panned
- sources, thus taking advantage of what ambisonics offers in a virtual audio
- environment.
-
-
- How Does It Help?
- =================
-
- Positional sound has come a long way from pan-pot stereo (aka pair-wise).
- Although useful at the time, the issues became readily apparent when trying to
- extend it for surround sound. Pan-pot doesn't work as well for depth (front-
- back) or vertical panning, it has a rather small "sweet spot" (the area the
- head needs to be in to perceive the sound in its intended direction), and it
- misses key distance-related details of sound waves.
-
- Ambisonics takes a different approach. It uses all available speakers to help
- localize a sound, and it also takes into account how the brain localizes low
- frequency sounds compared to high frequency ones -- a so-called psychoacoustic
- model. It may seem counter-intuitive (if a sound is coming from the front-left,
- surely just play it on the front-left speaker?), but to properly model a sound
- coming from where a speaker doesn't exist, more needs to be done to construct a
- proper sound wave that's perceived to come from the intended direction. Doing
- this creates a larger sweet spot, allowing the perceived sound direction to
- remain correct over a larger area around the center of the speakers.
-
- In addition, Ambisonics can encode the near-field effect of sounds, effectively
- capturing the sound distance. The near-field effect is a subtle low-frequency
- boost as a result of wave-front curvature, and properly compensating for this
- occuring with the output speakers (as well as emulating it with a synthesized
- soundfield) can create an improved sense of distance for sounds that move near
- or far.
-
-
- How Is It Used?
- ===============
-
- As a 3D audio API, OpenAL is tasked with playing 3D sound as best it can with
- the speaker setup the user has. Since the OpenAL API does not explicitly handle
- the output channel configuration, it has a lot of leeway in how to deal with
- the audio before it's played back for the user to hear. Consequently, OpenAL
- Soft (or any other OpenAL implementation that wishes to) can render using
- Ambisonics and decode the ambisonic mix for a high level of accuracy over what
- simple pan-pot could provide.
-
- When given an appropriate decoder configuration for the channel layout, the
- ambisonic mix can be decoded utilizing the benefits available to ambisonic
- processing, including frequency-dependent processing and near-field effects.
- Without a decoder configuration, the ambisonic mix can still be decoded for
- good stereo or surround sound output, although without near-field effects as
- there's no speaker distance information.
-
- In addition to surround sound output, Ambisonics also has benefits with stereo
- output. 2-channel UHJ is a stereo-compatible format that encodes some surround
- sound information using a wide-band 90-degree phase shift filter. This is
- generated by taking the ambisonic mix and deriving a front-stereo mix with
- with the rear sounds filtered in with it. Although the result is not as good as
- 3-channel (2D) B-Format, it has the distinct advantage of only using 2 channels
- and being compatible with stereo output. This means it will sound just fine
- when played as-is through a normal stereo device, or it may optionally be fed
- to a properly configured surround sound receiver which can extract the encoded
- information and restore some of the original surround sound signal.
-
-
- What Are Its Limitations?
- =========================
-
- As good as Ambisonics is, it's not a magic bullet that can overcome all
- problems. One of the bigger issues it has is dealing with irregular speaker
- setups, such as 5.1 surround sound. The problem mainly lies in the imbalanced
- speaker positioning -- there are three speakers within the front 60-degree area
- (meaning only 30-degree gaps in between each of the three speakers), while only
- two speakers cover the back 140-degree area, leaving 80-degree gaps on the
- sides. It should be noted that this problem is inherent to the speaker layout
- itself; there isn't much that can be done to get an optimal surround sound
- response, with ambisonics or not. It will do the best it can, but there are
- trade-offs between detail and accuracy.
-
- Another issue lies with HRTF. While it's certainly possible to play an
- ambisonic mix using HRTF and retain a sense of 3D sound, doing so with a high
- degree of spatial detail requires a fair amount of resources, in both memory
- and processing time. And even with it, mixing sounds with HRTF directly will
- still be better for positional accuracy.
|