AmbiCreator – producing Audio for VR and 360° content with the OC818
by Christoph Frank
This is my second blog post for Austrian Audio, but this time it is about software rather than hardware development.
At this year’s AES show we will release our new plugin, called the AmbiCreator, which can be used with a pair of OC818 microphones to create Audio for VR and 360° videos.
We wanted to showcase it at the show together with our headphones, but as these are special times for all of us, here is the “release show” as a blog post.
So, lights off, spots on…
What is ambisonics?
When we talk about audio for VR and 360° videos, we need to talk about ambisonics.
There is a ton of information about ambisonics on the internet – and whereas scientific papers, e.g. in the AES archives, show the great development in this field, most of what you find online is just a copy and paste of the same graph showing the balloon plots of the different orders of ambisonics: I bet that graph comes up as the first picture if you google ambisonics.
However, as I have the feeling that this doesn’t help most people in the audio industry, let me try to explain it in more hands-on audio terms.
My dear friends Franz Zotter and Matthias Frank from the Institute of Electronic Music, IEM Graz, also wrote a great book, of which I especially recommend the first chapter – have a read, it’s free!
As you probably already know, every polar pattern we normally use (figure-of-eight, cardioid, omni and everything in between) can be achieved by a combination of a figure-of-eight and an omni.
We also call this a mixture of the pressure (omni) and pressure-gradient (figure-of-eight) component.
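To make this mixture a bit more tangible, here is a minimal sketch of the textbook first-order pattern formula. The function name and the `omni_share` parameter are just names chosen for illustration, and the capsules are assumed to be ideal.

```python
import math

def first_order_pattern(omni_share: float, angle_deg: float) -> float:
    """Sensitivity of an ideal first-order pattern at a given angle of incidence.

    omni_share is the pressure (omni) part, the rest is the
    pressure-gradient (figure-of-eight) part:
    1.0 -> omni, 0.5 -> cardioid, 0.0 -> figure-of-eight.
    """
    theta = math.radians(angle_deg)
    return omni_share + (1.0 - omni_share) * math.cos(theta)

# Print front / side / back sensitivity for three familiar patterns.
for name, share in [("omni", 1.0), ("cardioid", 0.5), ("figure-of-eight", 0.0)]:
    front, side, back = (first_order_pattern(share, a) for a in (0, 90, 180))
    print(f"{name:16s} front={front:+.2f} side={side:+.2f} back={back:+.2f}")
```

Values of `omni_share` between these three give the in-between patterns such as wide-, super- and hyper-cardioid.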
This is actually also what you do when designing a microphone capsule – you try to achieve the appropriate combination of a “sealed” capsule (omni) and an open “figure-of-eight” capsule (where sound can also travel to the backside of the membrane) by finding the right acoustic resistance to balance them.
The directional characteristic we can achieve with this combination, e.g. a cardioid pattern, can only face in one direction – so a variation is only possible along one axis.
David Josephson explains nicely how this is done for his C700 series microphones here.
He also explains how a cardioid can “look” around a full 360° on the horizontal plane when two figure-of-eight patterns, angled at 90° to each other, are combined with an omni characteristic.
Schoeps’ Double-M/S microphone and plugin also use this approach, although here the second figure-of-eight and the omni are replaced by two cardioids – so three microphones in total.
If we now bring a third figure-of-eight pattern into the equation, we can actually calculate a cardioid which can “look” in any direction, on the horizontal plane, the vertical plane or any combination of the two.
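As a rough sketch of that calculation: with one omni signal and three figure-of-eight signals, a "virtual microphone" is just a weighted sum. The axis convention (x = front, y = left, z = up), the unity-gain omni and all function names are assumptions made for this example, not something taken from a particular plugin.

```python
import math

def unit_vector(azimuth_deg, elevation_deg):
    """Direction as (x, y, z) with x = front, y = left, z = up."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(az) * math.cos(el),
            math.sin(az) * math.cos(el),
            math.sin(el))

def encode(sample, azimuth_deg, elevation_deg):
    """What an ideal omni (W) and three ideal figure-of-eights (X, Y, Z)
    would pick up from a single source in the given direction."""
    ux, uy, uz = unit_vector(azimuth_deg, elevation_deg)
    return (sample, sample * ux, sample * uy, sample * uz)

def virtual_mic(wxyz, azimuth_deg, elevation_deg, omni_share=0.5):
    """First-order pattern aimed in any direction (0.5 = cardioid)."""
    w, x, y, z = wxyz
    ux, uy, uz = unit_vector(azimuth_deg, elevation_deg)
    return omni_share * w + (1.0 - omni_share) * (ux * x + uy * y + uz * z)

# A source up and to the front-left, picked up by two virtual cardioids.
source = encode(1.0, azimuth_deg=30, elevation_deg=60)
towards = virtual_mic(source, 30, 60)    # aimed straight at the source
away = virtual_mic(source, 210, -60)     # aimed in the opposite direction
print(f"cardioid aimed at the source: {towards:.3f}, aimed away: {away:.3f}")
```

The same four signals can be re-aimed as often as you like after the recording, which is the whole point of the format.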
Spoiler alert: Below you can see how a super-cardioid calculated via our new AmbiCreator plugin “looks” around a full circle in the vertical plane.
And now we get to the point where, damn, even I need to re-post the famous ambisonics graph (with permission from Matthias Frank – no, we are not related, he’s far cleverer than me…)
Figure from Ambisonics, Franz Zotter and Matthias Frank, Springer Verlag (2019)
The first row shows an omni characteristic, which is referred to as zeroth-order ambisonics.
The second row is 1st order ambisonics, with the three figure-of-eight patterns each along a different axis. Together with the omni we can generate all the patterns we are used to, like super-, hyper- and wide-cardioid, which are referred to as 1st order patterns.
The other two rows show 2nd and 3rd order ambisonics, which basically make characteristics narrower than a cardioid possible in order to achieve better localisation; more on that later. For these you need special microphone arrays, which is where companies like Zylia come in, go here.
Is ambisonics the same as binaural / 8D audio?
There is a hype going on lately called “8D” – it stands for 8 Dimensions and has this certain “wow” effect when you listen to it on headphones. And this is the point – it works on headphones and headphones only. The same is true for binaural audio.
Both are essentially stereo files where audio was either recorded or rendered with HRTFs (head-related transfer functions) to achieve an extremely realistic localisation and immersive (damn, now I also said the i-word…) experience.
These HRTFs are actually different for each individual, so how well this works depends on how closely your ears match the artificial head which was used for recording, or the HRTFs which were used for processing the 8D music, to give the impression that a certain sound is coming from a specific direction.
But they mostly fit well enough and yield an impressive result.
However, coming back to the initial question – NO, this has nothing to do with ambisonics!
A 1st order ambisonics audio stream, which is also the 3D audio definition available within YouTube, is basically a 4-channel file containing the signals of the aforementioned trio of figure-of-eight patterns and the omni pattern.
Imagine you really used four such microphones to record in the middle of an ensemble: the ambisonics audio file would simply be those 4 raw tracks – this is called B-format.
This file on its own is relatively useless – you will need a decoder to get material for our usual playback setups out of it, which are:
Stereo: a simple X/Y stereophony signal could be calculated from a B-format stream…
Surround: Signals for 5.1, 7.1 or any other horizontal speaker configuration can be calculated out of it
These two would only use the two figure-of-eight patterns in the horizontal plane
If we also use the vertical figure-of-eight, newer audio formats can be the output of a B-format decoder, like
Surround including height channels: speaker setups like 11.2 (which is 7.1 with a second subwoofer and 4 height channels above the listener) can be calculated out of it
Binaural: with a certain set of HRTFs, a binaural stereo signal for playback on headphones can be calculated
Wait – didn’t I just say binaural has nothing to do with ambisonics?
Yes, because binaural is just one of the many “output formats” we can gain from an ambisonic stream. For this, a bunch of directional patterns (mostly super- or hyper-cardioids) facing in a certain set of angles are calculated – then these signals are processed with the corresponding HRTF for each direction and summed up to one stereo file.
More sophisticated binaural decoders use some tricks to improve localisation (like the one developed by the IEM, which I recommend) – however, the localisation can never be as good as a true binaural recording, as the 1st order patterns are rather broad.
On the other hand, if you listen to a true binaural recording (like the famous virtual barber shop) and you turn your head, nothing will change, as this is a static recording.
But if you decode an ambisonics stream to binaural, the sound sources will move with the rotation, as they can be recalculated in real time if you have some sort of head tracking available. The nice thing is that we all have such a device in our pocket!
A mobile phone has several gyro sensors and a compass inside and can do that processing for you.
So a true binaural recording can’t be used for VR or 360° videos, as the sound cannot move with the viewer looking around.
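To illustrate why head tracking is so cheap with B-format: turning the whole sound scene around the vertical axis is just a 2D rotation of the two horizontal figure-of-eight signals. The sketch below assumes the convention x = front, y = left, z = up and uses made-up function names; a real player would apply this per sample block and also handle pitch and roll.

```python
import math

def rotate_scene_yaw(wxyz, angle_deg):
    """Rotate the sound scene counter-clockwise (seen from above)
    around the vertical axis. W and Z are unaffected."""
    w, x, y, z = wxyz
    a = math.radians(angle_deg)
    return (w,
            x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z)

def compensate_head_yaw(wxyz, head_yaw_deg):
    """If the listener turns left by head_yaw_deg, the scene has to
    turn right by the same amount to stay fixed in the room."""
    return rotate_scene_yaw(wxyz, -head_yaw_deg)

# A source straight ahead; the listener then turns 90 degrees to the left.
front_source = (1.0, 1.0, 0.0, 0.0)
w, x, y, z = compensate_head_yaw(front_source, 90.0)
print(f"W={w:.2f} X={x:.2f} Y={y:.2f} Z={z:.2f}")
```

After the head turn the source ends up on the negative y side, i.e. at the listener's right, which is what you would expect in real life.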
A-format vs. B-format
If you have ever looked for ambisonic microphones, you will probably have read about A-format rather than B-format.
This is because a microphone array consisting of three figure-of-eight capsules and one omni is rather hard to realise, so the alternative is a microphone consisting of 4 cardioid capsules from which all the patterns we need can be derived.
Besides these special A-format microphones (starting at around 1000€), you will need a plugin which calculates the B-format stream out of the A-format signal and voilà – you have your ambisonics signal, commonly accepted by most DAWs and platforms like YouTube.
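For the curious, the core of such a conversion is a simple sum-and-difference matrix. The sketch below assumes the usual tetrahedral capsule layout (front-left-up, front-right-down, back-left-down, back-right-up) and an arbitrary 0.5 scaling; real converters additionally apply frequency-dependent correction filters and their own gain conventions, which are left out here.

```python
def a_to_b_format(flu, frd, bld, bru):
    """Basic A-format to B-format matrix for a tetrahedral array.

    flu = front-left-up, frd = front-right-down,
    bld = back-left-down, bru = back-right-up capsule signal.
    The 0.5 factor is an assumption for this sketch.
    """
    w = 0.5 * (flu + frd + bld + bru)   # omni: everything summed
    x = 0.5 * (flu + frd - bld - bru)   # front minus back
    y = 0.5 * (flu - frd + bld - bru)   # left minus right
    z = 0.5 * (flu - frd - bld + bru)   # up minus down
    return (w, x, y, z)

# Sound arriving equally on all capsules carries no directional information.
print(a_to_b_format(1.0, 1.0, 1.0, 1.0))
# Sound only on the two upward-facing capsules shows up in W and Z.
print(a_to_b_format(1.0, 0.0, 0.0, 1.0))
```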
Shouldn’t this be a release show rather than a scientific monologue?
Okay – now that I have told you everything I know about ambisonics (don’t forget to read this book to REALLY know about ambisonics), we get to the fun part – recording ambisonics with our OC818 microphones.
As you know, all our OC818 microphones are calibrated to have the exact same sensitivity, which makes them ideal for stereo recordings. With our PolarDesigner plugin, you can even experiment with different polar patterns in post-production to find the best stereo setup for your recording.
With our newest plugin AmbiCreator we go one step further – getting an ambisonic B-format signal out of two OC818 microphones, e.g. a Live-Set.
As our OC818 microphones have a dual output, we can calculate any 1st order pattern (now you know what the scientific chitchat was good for) from those two outputs.
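A minimal sketch of that idea, assuming the two outputs behave like ideal front- and back-facing cardioids (real capsules will deviate, and this is not the plugin's actual processing; the function names are chosen for illustration):

```python
import math

def ideal_cardioid(angle_deg):
    """Sensitivity of an ideal cardioid for sound arriving at angle_deg."""
    return 0.5 + 0.5 * math.cos(math.radians(angle_deg))

def combine(front, back, omni_share):
    """Mix front and back cardioid signals into any first-order pattern.

    front + back gives the omni part, front - back the figure-of-eight
    part. omni_share: 1.0 = omni, 0.5 = cardioid, 0.0 = figure-of-eight.
    """
    omni = front + back
    figure_of_eight = front - back
    return omni_share * omni + (1.0 - omni_share) * figure_of_eight

# Sound arriving from 120 degrees off-axis, as seen by both diaphragms.
angle = 120.0
front = ideal_cardioid(angle)
back = ideal_cardioid(angle + 180.0)

for name, share in [("omni", 1.0), ("cardioid", 0.5), ("figure-of-eight", 0.0)]:
    print(f"{name:16s} {combine(front, back, share):+.2f}")
```

With `omni_share=0.5` you simply get the front cardioid back, which is a nice sanity check.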
If you have really read this far, you will already know that we need three figure-of-eight patterns and the omni signal for this.
Omni and figure-of-eight in the horizontal plane are easy and need no explanation, right? But what about the figure-of-eight in the vertical direction?
Well, we use a “trick” here which relates to the very common endfire beamforming technique used in telecom and automotive.
Thomas, our plugin programmer, and I wrote an eBrief for the AES show which you can read here, if you want to know the details. It will basically tell you how this is achieved and show that the vertical axis “only” works up to around 2 kHz – however, as most of the content of music and voice is in this range, it actually works quite well for all recording situations!
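To give a feeling for why such a technique runs out of steam at higher frequencies, here is a generic textbook sketch, not the method from the eBrief: the difference of two omni signals stacked vertically behaves like a vertical figure-of-eight only as long as the spacing is small compared to the wavelength. The 0.08 m spacing is a purely hypothetical value chosen for illustration, not a measurement of the actual setup.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, at roughly room temperature

def gradient_response(freq_hz, spacing_m, elevation_deg):
    """Magnitude of (upper omni - lower omni) for a plane wave arriving
    at the given elevation (0 = horizontal, 90 = straight from above)."""
    path_diff = spacing_m * math.sin(math.radians(elevation_deg))
    return 2.0 * abs(math.sin(math.pi * freq_hz * path_diff / SPEED_OF_SOUND))

def pattern_error(freq_hz, spacing_m, elevation_deg=30.0):
    """How far the pattern, normalised to straight above, deviates from
    an ideal vertical figure-of-eight at the given elevation."""
    on_axis = gradient_response(freq_hz, spacing_m, 90.0)
    actual = gradient_response(freq_hz, spacing_m, elevation_deg) / on_axis
    ideal = abs(math.sin(math.radians(elevation_deg)))
    return abs(actual - ideal)

def half_wavelength_frequency(spacing_m):
    """Frequency at which the spacing equals half a wavelength."""
    return SPEED_OF_SOUND / (2.0 * spacing_m)

spacing = 0.08  # hypothetical spacing in metres, for illustration only
for f in (200.0, 1000.0, 2000.0):
    print(f"{f:6.0f} Hz: deviation from figure-of-eight = {pattern_error(f, spacing):.3f}")
print(f"half-wavelength frequency: {half_wavelength_frequency(spacing):.0f} Hz")
```

The deviation grows as the frequency approaches the half-wavelength point, which is the general reason such a vertical component is limited to the lower part of the spectrum.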
To demonstrate the effect, we did some test recordings with some of Vienna’s finest musicians and a drummer (Sorry Walter for that joke!) in the Porgy&Bess jazz club.
This nice location gave us the possibility to place the musicians at a vertical angle as well (e.g. the great Ulrich Drechsler was placed on the balcony far behind the recording rig) – further, they were also “open” during these difficult times as they are hosting their great “the show must go (on)line” series of streamed live performances. Make sure to check it out and leave them a tip every Thursday and Saturday!
The two OC818 microphones were placed on top of each other close to the 360° camera – no additional microphones were used, instruments were balanced by moving musicians back and forth (old-school techniques meet 21st century technologies if you will). Besides the plugin-processing, no additional EQ or effects were used.
So if you have some sort of VR glasses (even Google Cardboard works fine) and a pair of headphones, here are our first two recordings:
As said before, how well the localization works for you depends on your ears (in comparison to the set of HRTFs YouTube is using) and even on your headphones.
If everything fits well you should e.g. be able to localize all the musicians introducing themselves in the second example:
I want this – what do I need to do?
So, if you want to experiment with ambisonics or are even confronted with this by a client but don’t want to buy an expensive and specialized A-format microphone, here’s what you need to do.
- First, unpack your OC818 live set (you don’t have one yet? Buy one!)
- Set it up as shown in the graph below and record both mics in dual output mode to a 4-channel track:
Channel1: Front = main XLR of lower mic
Channel2: Back = mini-XLR of lower mic
Channel3: Left = mini-XLR of upper mic
Channel4: Right = main XLR of upper mic
In the plugin window, there’s not much you need to do – if you see a signal on all four input and output level meters, you should already have your B-format stream ready and can decode it with the tools of your DAW or e.g. the IEM plugin suite.
Besides the total output gain you can also adjust the amount of z, i.e. vertical, information – at best, leave it at 0 dB. Horizontal orientation makes it possible to “virtually turn your listening head” around if you don’t have any head tracking available in your DAW.
So, this is the end of our “release show” – I imagine the applause and hope you like our work! Be sure that this won’t be our last excursion into the world of ambisonics as this already is and will be an even bigger field for us audio engineers!
Last word: Some viewers might have seen the mic stands visible at the piano even though I said no additional microphones were used – well, let’s say we used the opportunity to record this nice Fazioli piano to test one of our upcoming products…