… and Apple has finally come in for the kill.
Binaural sound is not new – it’s just one relic of the Modern Age, where many grand ideas have bubbled up and burst. Hearing sound all around your head via headphones has been promoted since experiments in the 1930s. Some record labels have dipped hesitantly into the technology, along with quadraphonic and 5.1; others, like Naxos, have consistently recorded live classical performances in binaural format, banking that their specialist audience is invested in the fine music experience. But most listeners have never been that interested, and are quite happy to listen through a single Bluetooth speaker with ‘ultra bass’ slapped over the top. Or through that infamous ‘car stereo’ experience every mixing engineer has to anticipate.
Part of the problem has been the headphones, particularly the nasty sort. As much as the 80s were ‘wired for sound’, the headphones were about equal to mass-produced cassettes in their awful reproduction.
The average listener wasn’t ready to spend hundreds of dollars on phones, or walk around in full-sized cans looking like they were part of some bizarre Mickey Mouse Club.
There have been at least three recent changes that have brought 360º sound back into contention. Firstly, the smartphone now acts as the user’s primary media source. That’s where all your music and your movies live, not the cinema or the CD shelf. Secondly, there are computer-assisted headphones and buds, where the headgear is both discreet and good quality. Wearing earbuds is now high status, and even comfortable. Third is the development of cinematic object-based audio such as ATMOS, at a time when cinema is shrinking down to the individual viewer.
Where a few years ago the idea was obscure, it’s now a given. Even Netflix is sending out surround in the form of Dolby ATMOS. But how many people really have space for a physical home cinema? The phone, the phones and the cinema have combined. We’ve reached a moment where a bleeding-edge idea has become mainstream – and that’s the borderland where Apple always pounces.
Rather than attempt a new kind of listening, they are selling their surround system as part of the well established home theatre experience. The screen (a phone or pad) is always the focus of the sound, which follows the general Dolby™ law that everything starts ‘in front’ of the viewer and extrudes out the ‘surround’*. This focus requires some insanely clever hardware – both the screen and the earbuds have gyroscopes in them that constantly calculate ‘front’ and ‘back’. Turn your head and the sound keeps aligned.
This is both excellent and annoying. Half the trouble with binaural sound is that we constantly turn our heads to catch sound at different angles; it’s part of working out the direction. Normal headphones just rotate the universe with your head, hurting the effect. Some fancy headphones have a tracker in them – I have these but most of humanity does not. YouTube rotates the sound as you point your phone around the space, which works great but isn’t instinctive. Apple’s AirPods Pro are the first mass-market headset that gives you the instinctive result. That’s excellent.
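The difference between tracked and untracked headphones comes down to one subtraction. Here is a minimal sketch of that idea, not Apple’s actual method: the function names and the sign convention (degrees, positive to the listener’s right) are my own assumptions, and a real system works with full 3D orientation rather than a single yaw angle.

```python
def wrap_degrees(angle):
    """Wrap an angle in degrees to the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def rendered_azimuth(source_azimuth, head_yaw, screen_yaw=0.0):
    """Direction of a sound relative to the listener's nose.

    source_azimuth: where the mix places the sound relative to the screen
                    (0 = at the screen, positive = to the right).
    head_yaw:       which way the head points in the room (hypothetical
                    reading from the earbud sensors).
    screen_yaw:     which way the screen sits in the room (hypothetical
                    reading from the phone sensors).
    """
    # Untracked headphones effectively skip this subtraction, so the
    # whole sound world turns with the head.
    return wrap_degrees(source_azimuth + screen_yaw - head_yaw)


# Dialogue sits at the screen (0 degrees). Turn the head 90 degrees to
# the right and the dialogue should now arrive from the left ear side.
print(rendered_azimuth(0.0, head_yaw=90.0))
```

With `head_yaw` left at zero the sound stays wherever the mix put it, which is the ‘rotating universe’ of ordinary headphones.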
What’s a bit annoying is that this 5.1 and 7.1 front-back alignment isn’t really about recorded music or sound. It’s about soundtrack. It’s about sitting. Everything is aligned with the image on the phone, which is Dolby’s conceptual meat and potatoes. The recommended listening position. To be fair you can create surround music using only 3D objects in ATMOS, but there is an established cinema practice in place with a dialogue speaker front and centre. And if I put my phone down and walk, do I want the music to come from the phone or to follow me? What if I want that to be the case for other people listening to my music? It’s OK to have front and back, but as we’ve been making surround for about 90 years now, there’s the possibility of losing some of that learning.
It’s interesting to compare with SONY’s 360º Reality Audio. This is based on the fields that SONY can dominate – music and game sound. The PlayStation 5 will have this format, as will a handful of second-tier music streaming services including Tidal. Game sound requirements are quite different to Apple’s system – the game world is not aligned with the physical screen, which is only a window to a virtual space, and sound sources are equally important in all directions. A threat behind your head can’t be met just by turning your head (unless you’re using VR, which gets less likely every year).
I’m pretty confident that Apple will dominate this field as they have many others. They’ve identified a format that is well known and has a lot of people already used to producing it. I can set up a 5.1 system easily, ATMOS a little less easily. While SONY is basing their format on the open MPEG-H standard, their tools are currently obscure and on request only. SONY has a huge range of headphones – but none are gyroscopic as far as I know. It won’t take long for that to change.
But we should take care not to allow ‘a solution’ to become ‘the solution’ for surround music. Like any technology it comes ready to be challenged, misused and patched around.
* ATMOS has two main levels – the first being the ‘bed’ – 10 channels of multi-track sound in a 7.1.2 layout. Then there are up to 118 further channels of sound ‘objects’ animated in 3D space, for 128 channels in all. These layers are combined and flattened to fit the number of speakers available. With headphones the sound is rendered out using a head-related transfer function (the sounds appear to reach the ear as if sent by virtual speakers). The ‘bed’ is like previous Dolby sound set-ups in having a set mono dialogue speaker, a front stereo pair and then extra surround channels. Objects can be placed as you wish, but the usual expectation is that they align with the ‘bed’.
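To give a feel for what ‘flattened to fit the number of speakers available’ means, here is a rough sketch that shares one object between the two nearest speakers in a horizontal ring. This is plain pairwise constant-power panning, swapped in for illustration; it is not Dolby’s renderer, it ignores height, and the function name and speaker angles are my own assumptions. It expects at least two speakers.

```python
import math


def pan_object(azimuth, speaker_azimuths):
    """Constant-power pan of one object between its two nearest speakers.

    azimuth:          object direction in degrees (0 = front, positive = right).
    speaker_azimuths: speaker directions in degrees, in any order.
    Returns one gain per speaker, in the order the speakers were given.
    """
    n = len(speaker_azimuths)
    # Walk the ring clockwise, starting from the front.
    order = sorted(range(n), key=lambda i: speaker_azimuths[i] % 360.0)
    az = azimuth % 360.0
    gains = [0.0] * n
    for k in range(n):
        a = order[k]
        b = order[(k + 1) % n]
        start = speaker_azimuths[a] % 360.0
        # Angular gap between this speaker and its clockwise neighbour.
        span = (speaker_azimuths[b] - speaker_azimuths[a]) % 360.0 or 360.0
        offset = (az - start) % 360.0
        if offset < span:
            frac = offset / span
            # cos/sin weights keep the summed power constant.
            gains[a] = math.cos(frac * math.pi / 2)
            gains[b] = math.sin(frac * math.pi / 2)
            break
    return gains


# The same object, flattened to a stereo pair and to a five-speaker ring
# (centre, right, right surround, left surround, left).
stereo = [-30.0, 30.0]
ring = [0.0, 30.0, 110.0, -110.0, -30.0]
print(pan_object(70.0, stereo))
print(pan_object(70.0, ring))
```

The object’s position never changes; only the list of speakers does, which is the whole appeal of object-based audio.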