An efficient AR sound experience for mobile apps

Semester project for the Sound and Music Computing master at AAU (2016)

Andrea Corcuera Marruffo, Jose Luis Diez-Antich, Matteo Girardi, Nikolaj Kynde, Mattia Patterna, Lars Schalkwijk

Aalborg University

Copenhagen, Denmark

2015 

The computational power of mobile devices has increased considerably in the last few years, and today almost every device is equipped with a Global Positioning System (GPS) receiver and a compass sensor. These facilities open up possibilities to enhance the user experience in daily life. This paper presents an application for mobile devices that uses an efficient head-related transfer function (HRTF) model to create 3-D soundscapes. In a small experiment, the developed 3-D audio engine is compared with a cosine panner model in terms of the quality and efficiency of the navigational cues. Although the experiment did not reveal significant differences between the two models, a critical observation of the study supports the view that a more sophisticated 3-D audio engine can improve the user experience in audio navigation.


Several mobile applications of the presented engine can be imagined. One example is an audio-navigation system for visually impaired people, with which users could move around by following the auditory cues provided by the application. The 3-D audio engine could also drive simple navigation systems for city tours or event guides that promote events in places such as concert halls, pubs, or restaurants. Other, leisure-oriented uses include an audio geocaching application in which users find ‘treasures’ hidden outdoors using their hearing.

The audio engine is the core part of the mobile application. It is based on the HRTF model introduced in [5], which is efficient enough for real-time applications. First, a monaural sound source is processed by a distance model, whose output feeds a head model and a room model in parallel. The stereo output of the head model is the input to a pinna echoes model, whose output is added to the output of the room model to obtain the spatialized version of the sound. To implement the head shadow block of the HRTF model, a Pure Data external called headShadow was written in the C programming language. Writing this source code required deriving the correct difference equation of the head shadow block. Apart from this external, the HRTF model was implemented entirely in Pure Data.


REFERENCES

[1] A. Farnell, Designing Sound. The MIT Press, 2010.

[2] D. Begault, 3-D Sound for Virtual Reality and Multimedia. AP Professional, 1994.

[3] J. Breebaart and C. Faller, Spatial Audio Processing: MPEG Surround and Other Applications. Wiley-Interscience, 2007.

[4] A. Meshram, R. Mehra, and D. Manocha, “Efficient HRTF computation using adaptive rectangular decomposition,” in Audio Engineering Society Conference: 55th International Conference: Spatial Audio, Aug 2014.

[5] C. Brown and R. Duda, “An efficient HRTF model for 3-D sound,” in 1997 IEEE ASSP Workshop on Applications of Signal Processing to Audio and Acoustics, Oct 1997.

[6] A. Härmä, J. Jakka, et al., “Augmented reality audio for mobile and wearable appliances,” J. Audio Eng. Soc., vol. 52, pp. 618–639, June 2004.

[7] Y. V. Alvarez, I. Oakley, and S. A. Brewster, “Auditory display design for exploration in mobile audio-augmented reality,” Pers Ubiquit Comput, vol. 6, pp. 987–999, 2011.

[8] F. Rumsey, Spatial Audio. Focal Press, 2001.

[9] R. P. Tame, D. Barchiese, and A. Klapuri, “Headphone virtualization: Improved localization and externalization of non-individualized HRTFs by cluster analysis,” in Audio Engineering Society Convention 133, Oct 2012.