Recently, I have been working a lot on my dissertation topic, trying to narrow down my ideas and define the focus of the research. One thing I know for sure: I will focus on audio perception in Augmented Reality systems. But before I can start any experiments, I have to find the tools that will allow me to create the audio rendering system. This task is challenging. The goal of an AR audio system is to create virtual sounds that mimic the characteristics of real sounds and blend seamlessly with the real environment in which the user is situated. Besides that, it also has to model the spatial hearing characteristics of the listener by allowing the sound source to be positioned in any direction relative to the user (I explained this topic in more detail in the previous post).
Modeling a virtual sound starts with generating the sound itself, either through sound synthesis or by reproducing a previously recorded sound. Next, its directivity pattern needs to be modeled. A directivity pattern describes how much sound energy emanates from the source in different directions. For example, a voice is louder in front of the speaker's head than behind it.
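To make the idea of a directivity pattern more concrete, here is a minimal sketch using a first-order (cardioid-family) pattern. This is only an illustration under my own assumptions: the function name `directivity_gain` and the parameter `alpha` are hypothetical, and a real voice has a more complex, frequency-dependent pattern than this simple formula.

```python
import math

def directivity_gain(angle_deg, alpha=0.5):
    """Linear gain of a first-order directivity pattern.

    angle_deg: angle between the source's front direction and the listener.
    alpha: pattern shape (assumed parameter): 1.0 is omnidirectional,
           0.5 is a cardioid, 0.0 is a figure-eight.
    """
    theta = math.radians(angle_deg)
    return abs(alpha + (1.0 - alpha) * math.cos(theta))

# Gain in front of, beside, and behind a cardioid-like source.
for angle in (0, 90, 180):
    print(angle, round(directivity_gain(angle), 3))
```

With the cardioid setting, the source is at full level in front, at half amplitude to the side, and silent directly behind. A value of `alpha` closer to 1 gives a milder front-to-back difference.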
In addition, modeling the space around the user usually also means modeling the reflections that occur in any enclosed space. Sound waves bounce off all of the surfaces in a room, creating reverb, the unique acoustic fingerprint of the space. When positioning a virtual sound in a real space, the simulated reflections need to match the environment the user is in. There are several ways of modeling room acoustics. The most common is geometrical acoustics, where a 3D model of the room is created first and the behavior of sound waves in that space is then calculated. The problem with this method is that it requires a lot of computational power for high accuracy. Another way is to measure the impulse response of the room, that is, to record all of its reflections, and then convolve that response with the sound signal(s). Implementing this effectively, however, requires a lot of measurements.
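The convolution step can be sketched in a few lines. The example below is a toy illustration, not a real rendering pipeline: the function name `convolve` and the hand-made impulse response (a direct sound plus a single quieter reflection two samples later) are assumptions of mine, whereas a measured room response would contain thousands of samples.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of a dry signal with a room impulse response."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            # Each input sample triggers a scaled, delayed copy of the response.
            out[i + j] += s * h
    return out

# Toy data (hypothetical): a two-sample "dry" sound and a response made of
# the direct sound (gain 1.0) and one reflection (gain 0.5) two samples later.
dry = [1.0, 0.5]
room_ir = [1.0, 0.0, 0.5]
wet = convolve(dry, room_ir)
print(wet)
```

The output is the dry sound followed by its quieter, delayed copy. The nested loop is fine for illustration, but for responses of realistic length an FFT-based convolution (for example `scipy.signal.fftconvolve`) is the usual choice.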
Clearly, there are still many technological obstacles to overcome before a commercial product can deliver a seamless AR audio experience.
The question I am most interested in, then, is this: what are the minimum requirements for the rendering system to achieve a realistic user experience for a given application?
Image source: BGR.com