The 3D Future of Sound

Published April 22, 2015 in Backchannel.

I can’t believe my ears. I’m waving a plain grey cube back and forth around my head, and I could swear the instrumental music I’m hearing is coming from that cube, as if it were a tiny speaker. Brash violins overwhelm my left ear as I bring the cube close to it, and then fade to a more manageable volume as I move it further away. I hold the cube behind me, and it feels like I just turned my back to a string quartet. When I move the cube over my head, my scalp prickles as the soaring string instrumentals play from above. But the cube isn’t actually producing a single sound. I’m hearing the music through headphones, and the moment I take them off the illusion is over — the cube is silent.

I’m standing in a large room in the University of Maryland’s tech incubator building, where the eight-person startup VisiSonics is based. It’s their software trickery that’s fooled my brain into hearing 3D sound over headphones. The grey cube I’m holding is connected to a device that tracks its position as I wave it around. VisiSonics’ RealSpace 3D technology uses spatial data from the tracker to process the music so it sounds as if it is emanating from the cube. This particular demo is just a proof of concept, demonstrating the software’s ability to fool your brain. But the most striking thing about it is how easily it makes me forget that I’m wearing headphones at all.

That’s just the effect VisiSonics is hoping for. The company aims to become the default audio engine for games and for virtual or augmented reality. It’s already got one big win to its credit. Facebook’s Oculus VR, maker of the Oculus Rift virtual reality headset, licensed the RealSpace 3D technology last October and recently incorporated it into Oculus’s audio software development kit. VisiSonics is in talks with other companies, as well. “The goal of well-executed 3D audio is, essentially, even though you’re hearing things over headphones, it’s as if those headphones aren’t there,” says Ramani Duraiswami, one of the company’s founders and a professor of computer science at the University of Maryland in College Park, Maryland.

But the company’s ambitions go beyond gaming and virtual reality. “We want to be there wherever people hear sounds over headphones, and change that experience,” says Duraiswami.

We take our ability to hear sound in three dimensions for granted — even with eyes closed, we can still pinpoint where a sound is coming from. Our brains are incredibly good at interpreting the sound that enters our ears, which act like two spatially separated microphones. The brain uses multiple cues, such as the delay between when a sound reaches our left and right ears, to work out where a sound is coming from. Replicating that effect through speakers or headphones is not easy.
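To give a feel for how small that left-right delay is, here is a minimal sketch using Woodworth's spherical-head approximation, a textbook formula and not anything the article attributes to VisiSonics. The head radius, the speed of sound and the function name are assumptions chosen for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at roughly 20 degrees Celsius
HEAD_RADIUS_M = 0.0875      # assumption: a commonly quoted average head radius


def interaural_time_difference(azimuth_deg: float) -> float:
    """Estimate the arrival-time gap between the two ears, in seconds.

    Uses Woodworth's approximation ITD = (a / c) * (theta + sin(theta)),
    which treats the head as a rigid sphere and is valid for a source
    between straight ahead (0 degrees) and directly to one side (90 degrees).
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth_deg must be between 0 and 90")
    theta = math.radians(azimuth_deg)
    return HEAD_RADIUS_M / SPEED_OF_SOUND_M_S * (theta + math.sin(theta))


if __name__ == "__main__":
    for angle in (0, 30, 60, 90):
        itd_us = interaural_time_difference(angle) * 1e6
        print(f"{angle:>2} degrees: {itd_us:.0f} microseconds")
```

Under these assumptions, a sound directly to one side reaches the far ear well under a millisecond after the near ear, which is the scale of timing difference the brain has to resolve.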

One way to create a 3D sound experience is to just place speakers all around you. That’s how surround sound works, and what some movie theaters do, but it doesn’t work with headphones. And the rise of mobile devices means that many of us play games, listen to music and watch movies while using headphones. VisiSonics’ software can deliver 3D sound through any headphones, on any device. Imagine listening to Beethoven’s 9th symphony on your phone and feeling as if you were at a live performance with the best seat in the house. Or watching a horror movie on an iPad and hearing the villain’s footsteps slowly creep up behind you.

Audio engineers have tried to produce such 3D sound experiences for more than a century. For most of that history, they have typically used a dummy head with a microphone in each ear to record audio, in an attempt to capture sound the same way it would be heard by our ears.

VisiSonics has found a way to get rid of the dummy head and use software, instead. One way we locate the source of sounds is by turning our heads and seeing how that changes what we hear. For virtual or augmented reality to be truly immersive, not only do you have to be able to pinpoint where a sound is coming from, but that sound has to change realistically based on how you move your head. VisiSonics’ software calculates in real-time how a sound should change when we turn our heads, thus maintaining the sense of immersion.
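The head-turning idea can be illustrated with a very small calculation. This is only a sketch of the geometry, assuming a tracker that reports head yaw in degrees; the function name and angle conventions are hypothetical and say nothing about how RealSpace 3D is actually implemented.

```python
def relative_azimuth_deg(source_azimuth_deg: float, head_yaw_deg: float) -> float:
    """Return the source direction relative to where the listener's nose points.

    Both inputs are angles in the room's frame of reference (assumed
    convention: 0 is "north", positive is clockwise). The result is wrapped
    into the range [-180, 180) so that 0 always means "straight ahead".
    """
    return (source_azimuth_deg - head_yaw_deg + 180.0) % 360.0 - 180.0


if __name__ == "__main__":
    # A source fixed at 90 degrees in the room: as the head turns toward it,
    # the direction that has to be rendered over headphones moves to 0.
    for yaw in (0.0, 45.0, 90.0):
        print(yaw, relative_azimuth_deg(90.0, yaw))
```

A renderer would recompute this relative direction every time the tracker reports a new head pose, so the virtual source stays fixed in the room while the head moves.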

The software also accounts for the features of a room, which determine how a sound bounces around a room before it reaches you. You hear not just the original sound but also its many reflections off the walls, furniture and other features of the room. These reverberations are particularly useful in determining the distance between a listener and the source of a sound, and in conveying a sense of space. Part of what makes RealSpace 3D sound so real, Duraiswami says, is the accuracy with which the software replicates reverberations.
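To see why reflections carry distance information, consider a toy case with a single wall. The sketch below uses the standard image-source idea (a reflection behaves like a second source mirrored behind the wall) with inverse-distance attenuation. The geometry, the reflectivity value and the function name are assumptions for illustration, not details of VisiSonics' software.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at roughly 20 degrees Celsius


def first_reflection(source_xy, wall_x, reflectivity=0.8):
    """Compare the direct sound with one wall reflection.

    The listener sits at the origin, the wall is the vertical line
    x = wall_x, and the source lies on the listener's side of the wall.
    Returns (extra_delay_seconds, reflected_to_direct_amplitude_ratio).
    """
    sx, sy = source_xy
    if wall_x <= 0 or sx >= wall_x:
        raise ValueError("listener and source must both be in front of the wall")
    direct_path = math.hypot(sx, sy)
    if direct_path == 0:
        raise ValueError("source must not coincide with the listener")
    # Mirror the source across the wall to get the image source.
    reflected_path = math.hypot(2 * wall_x - sx, sy)
    extra_delay = (reflected_path - direct_path) / SPEED_OF_SOUND_M_S
    # Amplitude falls off as 1 / distance; the wall absorbs part of the energy.
    ratio = reflectivity * direct_path / reflected_path
    return extra_delay, ratio


if __name__ == "__main__":
    for x in (1.0, 2.0, 2.5):
        delay, ratio = first_reflection((x, 0.0), wall_x=3.0)
        print(f"source at {x} m: echo arrives {delay * 1000:.1f} ms later, "
              f"relative level {ratio:.2f}")
```

In this toy setup, the farther the source is from the listener, the closer the reflection comes to the direct sound in both level and arrival time, which is one reason reverberation helps us judge distance.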

The third crucial component of Duraiswami’s software is its ability to mimic the way sound from a particular direction interacts with your body before entering the ear canal. The direction a sound is coming from changes how it reflects off a person’s head, their torso, as well as the dips and curves of the outer ear. Scientists have characterized these effects with the aptly named head-related transfer function, or HRTF. If you want a sound to seem as if it is coming from a particular direction, you have to apply the appropriate HRTF.
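In signal-processing terms, applying an HRTF usually means filtering the sound separately for each ear. A common way to do that is to convolve the signal with a head-related impulse response (HRIR), the time-domain counterpart of the HRTF. The sketch below shows that step with made-up three-sample filters. Real HRIRs are measured, run to hundreds of samples, and differ for every direction, so the values and names here are purely hypothetical.

```python
def convolve(signal, kernel):
    """Plain discrete convolution of two lists of samples."""
    if not signal or not kernel:
        return []
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono signal into a (left, right) pair for headphones."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Hypothetical toy filters for a source on the listener's left:
# the left ear hears it at once and at full level, while the right ear
# hears it two samples later and at half the level.
TOY_HRIR_LEFT = [1.0, 0.0, 0.0]
TOY_HRIR_RIGHT = [0.0, 0.0, 0.5]

if __name__ == "__main__":
    left, right = render_binaural([1.0, 0.5], TOY_HRIR_LEFT, TOY_HRIR_RIGHT)
    print("left: ", left)
    print("right:", right)
```

Swapping in a different pair of filters is what moves the apparent source to a different direction. A production renderer would typically use measured filters and a faster, FFT-based convolution.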

As far as my brain is concerned, the VisiSonics-processed sound coming out of my headphones might as well have come from the grey cube I’m holding. Three-dimensional audio is “a three-legged stool,” says Duraiswami. “If any one leg is gone, the effect is broken.”

With these advances in 3D audio, any music, movie or game can now be processed to make it seem as if you’re hearing sounds in an open field, or a cozy living room, or a concert hall. Music and movies might even be created with 3D audio in mind from the start. For instance, Björk just released a 360-degree music video with 3D sound (powered by one of VisiSonics’ competitors, with their own take on 3D audio), which she described on Facebook as making listeners feel “as if you are on that beach and with the 30 players sitting in a circle tightly around you.” VR movie studios that are trying to make immersive 360-degree movies, where the action takes place all around you, will need such technology to be truly convincing.

The visual side of virtual reality has made huge strides in the past decade, and audio is only now catching up — just in time, as several VR companies have recently either released or announced plans to release new headsets. “You can’t have just half the immersion,” Duraiswami says.

Another RealSpace 3D demo, this one created by an independent developer using software plugins released by VisiSonics, helps me understand why Oculus wants 3D audio for virtual reality. It’s my first time trying a virtual reality headset, and my first impulse is to just soak in the visuals. I’m standing on a wooden walkway over water teeming with fish, tropical foliage all around. I turn my head to take in the lush graphics of this world, a sensation familiar from exploring the rich environments of countless video games. What’s different this time is the impression of being physically present inside a video game, a sensation that owes a lot to the expansiveness of RealSpace 3D’s sound. I feel like I’m out in the open, under a night sky, with crickets chirping in the background and water gushing out of a pipe to my right. As I move my head to take in all the sights, the sounds shift just the way I expect them to, preserving the sense of immersion.

VisiSonics is also trying to bring more immersive audio to the gaming world: the company imagines players able to pinpoint enemies or find their way using their ears alone. The team has released a RealSpace 3D plugin for Unity, a major game development platform, and is working on incorporating their software into other game engines. Because it could be used on anything from an iPhone to a high-end computer, VisiSonics’ software tries to deliver the best audio experience possible based on the computing power available to it.

VisiSonics isn’t the only company to realize there’s a market for 3D sound. Its rivals include Impulsonic and Two Big Ears, both small, young startups. Thrive Audio, another startup that produces realistic audio for virtual reality, was recently snapped up by Google, a sign of the increased interest in this space. Rod Haxton, VisiSonics’ lead software engineer, says competition is healthy, and could help legitimize 3D sound. “What’s going to separate somebody from choosing them or choosing us is quality of the sound, how easy it is to use, and whether or not it’s compute-intensive,” he says.

Haxton attributes the success of VisiSonics’ software to the research conducted by Duraiswami and Dmitry Zotkin, a computer scientist at the University of Maryland who also works part-time at VisiSonics. “It’s the genius of Dmitry and Ramani,” says Haxton. “There’s some pixie dust going on that others just haven’t gotten to yet.”

Pixie dust or not, it took Duraiswami and Zotkin more than 10 years to perfect the RealSpace 3D technology, and even more legwork to commercialize it. It all began with a project to help newly blinded soldiers.

VisiSonics was originally founded to produce and sell their “audio camera,” shown here hooked up to a laptop. Photo by Sandeep Ravindran

Duraiswami joined the University of Maryland in 1998 and decided to use his expertise in physics and engineering to study audio. In one early project, he sought to create a simulated audio environment for soldiers with damaged vision. The goal was to give the soldiers a way to virtually experience the sounds of a street corner or office, say, so that they could learn to navigate using these sounds before they went out into the real world. Duraiswami developed an “audio camera” as well as software — later dubbed RealSpace 3D — to produce audio environments over headphones.

Duraiswami and graduate student Adam O’Donovan founded VisiSonics four years ago, initially to produce and sell the audio camera. Soon they realized their sound-rendering technology could be applied in many other ways. But they ran into a hurdle — how to get users and companies to actually adopt it. “They say if you build a better mousetrap, the world will come,” says Duraiswami. “But that’s only part of it.”

Although their technology could have profound effects on music or movies, these industries can be hard to break into. “That’s why we think VR is very important,” Duraiswami says. “It’s a completely new way of doing things.”

It was clear to Duraiswami from the beginning that his technology worked well. But how was a tiny company, based far from Silicon Valley, to get noticed by any of the giants of virtual reality? “Oculus was on our radar for a bit,” says Duraiswami. “We knew virtual reality was an application, and our CEO, Gregg Wilkes, was basically pursuing them.” It took more than a year for those efforts to come to fruition.

Haxton, the lead software engineer, joined VisiSonics in 2012 after a long career spent mostly in the games industry. Excited by the potential of 3D audio in games, he cobbled together a crude demo in which players controlled a giant bunny. But when the time came to show the demo at a conference featuring a lot of serious military simulations, he replaced the bunny with a gun-toting protagonist surrounded by futuristic tanks and helicopters — much more fitting, though it did include a robot bustling around playing MC Hammer’s “U Can’t Touch This.”

Duraiswami and his colleagues took their demos to various conferences, but they got little traction until they attended the 2014 Game Developers Conference in San Francisco. They didn’t have a booth at the show, but Haxton patrolled the show floor wearing a T-shirt with “Ask me for a demo of true 3D game audio” emblazoned on the back. Meanwhile, Duraiswami and Wilkes commandeered some unused tables and tried to find Oculus VR employees to show their demos to. “We essentially waylaid those guys,” says Duraiswami. When some of the Oculus co-founders tried out VisiSonics’ software, they liked it enough to begin working out a licensing deal.

“That’s how we got in — just through guerrilla tactics and trying to grab whoever we could to hear our stuff,” Haxton says. The team’s success at the trade show was a major break, he says.

Ramani Duraiswami (left) and Dmitry Zotkin (center) have worked on audio for more than 10 years, and continue to use their backgrounds in physics and engineering to improve the accuracy of the RealSpace 3D technology. Also pictured is Ph.D. student Yuancheng Luo (right), who has since graduated and now works at VisiSonics. Photo by John T. Consoli

To make 3D audio even more lifelike, Duraiswami is working to measure each person’s head-related transfer function. At the moment, RealSpace 3D uses generic HRTFs that work well but don’t account for the differences in the shapes and sizes of our ears and heads, which in turn affect our perceptions of sounds.

Measuring HRTFs typically takes two to three hours, but Duraiswami has been whittling it down to tens of seconds. “Our goal is to be able to personalize every person’s music or audio experience with their personal HRTF,” he says.

With this and other advancements, both 3D sound and virtual reality promise to keep getting better. “Ten or 20 years from now, when people look back at the beginning, it’ll almost be like us looking back at Atari 8-bit games,” says Haxton. “VR is taking off again, and small little VisiSonics out of College Park, Maryland is along for this ride.”