Visual sonification is the process of converting visual properties of objects into sound signals. This paper describes the Michigan Visual Sonification System (MVSS), which uses this process to help the visually impaired distinguish different objects in their surroundings. MVSS first uses depth information to segment and localize salient objects, and then represents each object's appearance as a histogram of visual features. A dictionary of invariant visual features (or words) is created in an a priori, offline learning phase using Bag-of-Words modeling. The histogram of a segmented object is then converted to a sound signal whose volume and 3D placement are determined by the object's position relative to the user. The system then relies on the considerable discriminating power of the human brain to localize and “classify” the sound, enabling the user to distinguish between visually distinct object classes. The paper describes the components of MVSS in detail and presents promising initial experimental results.
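The following is a minimal sketch of the histogram-to-sound idea outlined above, not the authors' implementation. It assumes a pre-learned visual-word dictionary (here a random stand-in for k-means cluster centers), a hypothetical descriptor dimension, and a simple mapping in which each histogram bin weights one harmonic while azimuth and distance from the depth stage control stereo panning and volume.

```python
import numpy as np

RNG = np.random.default_rng(0)
DICT_SIZE = 64   # assumed number of visual words
DESC_DIM = 128   # assumed local-descriptor length (e.g., SIFT-like)

# Stand-in for the dictionary learned offline via Bag-of-Words modeling.
dictionary = RNG.normal(size=(DICT_SIZE, DESC_DIM))

def bow_histogram(descriptors: np.ndarray) -> np.ndarray:
    """Quantize each descriptor to its nearest visual word and count."""
    d2 = ((descriptors[:, None, :] - dictionary[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=DICT_SIZE).astype(float)
    return hist / hist.sum()  # normalized appearance signature

def sonify(hist: np.ndarray, azimuth: float, distance: float,
           sr: int = 16000, dur: float = 0.5) -> np.ndarray:
    """Map histogram bins to harmonic amplitudes; place the sound in stereo.

    azimuth in [-1, 1] (left..right) and distance in meters are assumed
    to come from the depth-based localization stage.
    """
    t = np.linspace(0.0, dur, int(sr * dur), endpoint=False)
    base = 220.0  # assumed base frequency in Hz
    # Each visual word contributes one harmonic, weighted by its bin value.
    mono = sum(a * np.sin(2 * np.pi * base * (k + 1) * t)
               for k, a in enumerate(hist))
    mono *= 1.0 / (1.0 + distance)       # volume falls off with distance
    left = mono * (1.0 - azimuth) / 2.0  # constant-sum stereo panning
    right = mono * (1.0 + azimuth) / 2.0
    return np.stack([left, right], axis=1)

# Example: a segmented object with 200 local descriptors, placed to the right.
descs = RNG.normal(size=(200, DESC_DIM))
stereo = sonify(bow_histogram(descs), azimuth=0.5, distance=2.0)
print(stereo.shape)  # (8000, 2) stereo samples
```

Under this mapping, objects with different appearance histograms produce different timbres, while the spatial cues (panning and volume) convey where the object is; the user's auditory system then does the localization and discrimination, as the abstract describes.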