Robust Detection of Environmental Sounds in Binaural Auditory Scenes

Ivo Trowitzsch; Johannes Mohr; Youssef Kashef; Klaus Obermayer

doi:10.1109/TASLP.2017.2690573

Source

IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, no. 6, pp. 1344-1356, 2017

Abstract

In realistic acoustic scenes, the detection of particular types of environmental sounds is often impeded by the simultaneous presence of multiple sound sources. In this work, we use simulations to systematically investigate the impact of superimposed distractor sources on sound type classification in a binaural robotic system and suggest techniques for increasing robustness under such conditions. First, we demonstrate that sound type classifiers are less affected by the presence of a distractor source when target sounds are superimposed with strongly varying general environmental sounds during training. Second, we show that the generalization performance of such models depends on how similar the angular source configuration and the signal-to-noise ratio are to the conditions under which the models were trained. Based on these results, we demonstrate how robust models can be obtained by including a variety of different conditions in the training data, a procedure called multi-conditional training. We evaluate this technique by training with ambient sources as well as with point sources in varying angular configurations, and show that it is an effective approach for producing models with close-to-optimal performance under a wide range of conditions. Finally, we investigate the impact of head orientation and find that it has a significant influence on classification performance.
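To make the idea of superimposing a distractor at a controlled signal-to-noise ratio concrete, the following is a minimal sketch, not the authors' implementation. It assumes single-channel signals held as NumPy arrays and defines SNR as the ratio of mean target power to mean distractor power in dB; the function names `mix_at_snr` and `multi_conditional_set` are hypothetical. Binaural rendering and angular source placement, which the paper studies, are deliberately left out.

```python
import numpy as np


def mix_at_snr(target, distractor, snr_db):
    """Superimpose `distractor` on `target` at the requested SNR (in dB).

    Assumption: SNR is mean target power over mean distractor power.
    Returns the mixture and the gain applied to the distractor.
    """
    target = np.asarray(target, dtype=float)
    distractor = np.asarray(distractor, dtype=float)

    # Truncate both signals to a common length before mixing.
    n = min(len(target), len(distractor))
    target, distractor = target[:n], distractor[:n]

    p_target = np.mean(target ** 2)
    p_distractor = np.mean(distractor ** 2)
    if p_target == 0.0 or p_distractor == 0.0:
        raise ValueError("both signals must have non-zero power")

    # Choose the gain so that p_target / (gain^2 * p_distractor) = 10^(snr_db / 10).
    gain = np.sqrt(p_target / (p_distractor * 10.0 ** (snr_db / 10.0)))
    return target + gain * distractor, gain


def multi_conditional_set(target, distractors, snrs_db):
    """Build one mixture per (distractor, SNR) pair for a single target sound."""
    return [
        (snr_db, mix_at_snr(target, distractor, snr_db)[0])
        for distractor in distractors
        for snr_db in snrs_db
    ]
```

Calling `multi_conditional_set(target, distractors, [-10, 0, 10])` would yield one mixture per distractor and SNR value, so a classifier trained on the result sees each target under several conditions rather than only in isolation.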