Delivering Higher Order Ambisonics on VLC 3 – the Cookbook

My last post talked much about production but little about delivery. That’s true of most current technical discussion: the difficulty of making the material is exhausting before you even consider an audience. I did mention YouTube and Apple’s AirPods, but only as reductions of the format, cut-down versions. How can an audience hear the ‘real thing’? Is the audience going to care very much? Let’s see.

We need to target a freely available, cross-platform, easy-to-use delivery tool. The obvious candidate is VLC, which runs on near everything, costs nothing, and has millions of users. VLC is not quite for the casual user but, let’s be honest, the casual user is not ready for ambisonics just yet. The enthusiast will be well served.

Documentation for VLC is woeful even for open source, and for ambisonics it’s woefully woeful. You have to scour the net for days picking up titbits – or read the guide I’m forming here.

Since version 3, VLC has been able to play third order (16 channel) ambisonics. This seems to be the work of Peter Stitt*, but apart from that VLC has no documentation about it. If you create a 360° video for YouTube it will also play in VLC, so my first clue was that VideoLAN probably expanded on Google’s specification. I then saw a mention on the Facebook Spatial Workstation page that Angelo Farina had hacked Google’s injection code so that it works for more than 1st order. But how do you make the 16 channel MP4 file in the first place?
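For anyone wondering where the 16 comes from: a full-sphere ambisonic signal of order N needs (N + 1)² channels. Here is a minimal sketch of that arithmetic; the function name is my own invention, not part of any tool mentioned here.

```python
def ambisonic_channels(order: int) -> int:
    """Channel count for full-sphere ambisonics of the given order: (N + 1)^2."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


# First, second and third order
for order in (1, 2, 3):
    print(f"order {order}: {ambisonic_channels(order)} channels")
```

So first order is 4 channels, second is 9 and third is 16.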

I’ll skip a lot of pain here and jump to a post I found on VLC’s discussion area where a lone voice tried to get it working, completely ignored by everyone, especially VideoLAN. I think this ‘drwig’ is likely Bruce Wiggins over at the University of Derby, UK. He confirmed the Farina patch and gave the magic incantation that builds a higher order video. It goes like this:

ffmpeg -loop 1 -i EQUIRECT.JPG -i "3rd Order ambix.wav" -map 1:a -map 0:v -c:a copy -channel_layout hexadecagonal -c:v libx264 -b:v 40000k -bufsize 40000k -tune stillimage -shortest "3rd Order Video.mov"

This instructs FFmpeg, a command line audio-visual tool kit, to take as input a still image, EQUIRECT.JPG, and your audio file, “3rd Order ambix.wav”. (The equirectangular image will be used as a visual compass to turn the sound field around in 360°. It needs to be twice as wide as it is tall.) It then copies the audio across untouched and labels its 16 mono channels with the ‘hexadecagonal’ layout, which is FFmpeg’s name for a 16 channel arrangement. Finally it loops the image into a video that lasts as long as the audio file, and writes the two into a new combined movie called “3rd Order Video.mov”.
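If you are making more than one of these, the same incantation can be assembled from a script so the file names with spaces are handled for you. This is only a sketch: the function names and the default bitrate argument are my own, and it simply rebuilds the command above without adding or checking any FFmpeg options.

```python
import shlex


def is_equirectangular(width: int, height: int) -> bool:
    """The still image should be twice as wide as it is tall."""
    return width == 2 * height


def build_ffmpeg_command(image, audio, output, video_bitrate="40000k"):
    """Return the ffmpeg argument list from the post (hypothetical helper)."""
    return [
        "ffmpeg",
        "-loop", "1", "-i", image,            # input 0: the still image, looped
        "-i", audio,                          # input 1: the 16 channel WAV
        "-map", "1:a",                        # take the audio from input 1
        "-map", "0:v",                        # take the picture from input 0
        "-c:a", "copy",                       # do not re-encode the audio
        "-channel_layout", "hexadecagonal",   # label it as a 16 channel layout
        "-c:v", "libx264",
        "-b:v", video_bitrate,
        "-bufsize", video_bitrate,
        "-tune", "stillimage",
        "-shortest",                          # stop when the audio ends
        output,
    ]


cmd = build_ffmpeg_command("EQUIRECT.JPG", "3rd Order ambix.wav", "3rd Order Video.mov")
print(shlex.join(cmd))  # a copy-and-paste version with the quoting done
```

To actually run it, pass the list to subprocess.run(cmd, check=True) on a machine with FFmpeg installed.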

The new video must then have its spatial audio metadata injected using Farina’s patched version of Google’s injection code. This runs only under version 2.7 of Python, which will need to be installed.

If you have performed the incantation properly the movie should load up into VLC on your work machine and sound noticeably better than the 1st order equivalent. In my test video a sound that was behind you in 1st order can now be heard to move around behind you. It really sounds like something worth the fuss with technology.

This took days on end. So I think it’s gorgeous.

NB: this paragraph has been superseded; please see the new information below. (It’s all very inspiring, but now a new set of problems has popped up. Because the video carries the sound as 16 channels of uncompressed WAV audio, the files are enormous. They don’t stream at all well. The iPhone is not happy about it. My first solution was to try converting the WAV to AAC, but the Python injection refuses to see the audio track. Only WAV is acceptable… and I don’t know if that’s a decision or a technical limitation. More concerning is that when trying out the file on various devices I heard different outcomes in VLC… the iPhone doesn’t sound like the PC. The Mac is more similar but seems to be bottom heavy. VLC has a large number of settings and preferences, and if these are set differently you end up with large differences… or maybe (worst of all) the different builds solve problems in different ways…)

So that comes back to having to educate the audience. When we got our first colour TV back in the day there was a bit of tuning and a bit of calibration. But do audiences these days put up with that? The question I asked at the beginning is probably answered ‘no’: the mainstream audience is not going to go past Spotify (and even Tidal and Sony can’t fix that). We serve the enthusiast. We always did.

*UPDATES: Peter Stitt confirms he worked on the ambisonics rendering part of VLC, but that the file parsing was by the VLC team and is ‘a bit of a black art’. Advice from Angelo Farina is that AAC can only support 8 channels and so cannot be used for any higher order recordings. Some have suggested using Opus instead, which can have 255 channels, but all my experiments have failed to make a playable file (there is something about ‘families’ that is obscure). So there doesn’t appear to be any useful way to compress ambisonic audio.
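To see why an 8 channel ceiling rules out anything above first order, here is a small sketch that takes the channel limits exactly as quoted above (I have not verified them independently) and finds the highest order that fits. It is channel-count arithmetic only, and says nothing about whether any player will actually open such a file. The names are hypothetical.

```python
# Channel limits as quoted in this post; treat them as assumptions
CODEC_MAX_CHANNELS = {"AAC": 8, "Opus": 255}


def highest_order_that_fits(max_channels: int) -> int:
    """Largest ambisonic order N such that (N + 1)^2 <= max_channels."""
    if max_channels < 1:
        raise ValueError("need at least one channel")
    order = 0
    # Step up while the next order's channel count still fits
    while (order + 2) ** 2 <= max_channels:
        order += 1
    return order


for codec, limit in CODEC_MAX_CHANNELS.items():
    print(f"{codec}: up to order {highest_order_that_fits(limit)}")
```

With a limit of 8, second order (9 channels) is already one channel too many.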

I have instead made some videos in second order surround (9 channels). These are not bad. I can hear the difference, but the end user probably won’t care. I also managed to make the video much smaller, and so it’s near acceptable.
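As a rough guide to how much the audio alone contributes, uncompressed PCM grows in direct proportion to the channel count. The sketch below assumes 48 kHz, 24 bit audio, which is my guess at a typical setting and not necessarily what my own files use; the function name is made up for illustration.

```python
def pcm_bytes_per_minute(channels: int,
                         sample_rate_hz: int = 48_000,   # assumed sample rate
                         bits_per_sample: int = 24) -> int:  # assumed bit depth
    """Raw PCM payload for one minute of audio, ignoring container overhead."""
    bytes_per_sample = bits_per_sample // 8
    return channels * sample_rate_hz * bytes_per_sample * 60


for label, channels in (("1st order", 4), ("2nd order", 9), ("3rd order", 16)):
    megabytes = pcm_bytes_per_minute(channels) / 1_000_000
    print(f"{label}: {megabytes:.1f} MB per minute")
```

Under those assumptions, dropping from 16 channels to 9 removes a little under half of the audio data, before touching the video bitrate at all.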

Making VLC work the same on all platforms can be done reasonably quickly. If you have made any adjustments, reset the preferences (I know that’s severe, but it’s effective). Now just set the menu item Audio > Stereo Audio Mode > Headphones. Don’t set any other filters.
