Visual sonification is the process of converting visual properties of objects into sound signals. This paper describes the Michigan Visual Sonification System (MVSS), which uses this process to help the visually impaired distinguish different objects in their surroundings. MVSS first uses depth information to segment and localize salient objects, and then represents each object's appearance as a histogram of visual features. A dictionary of invariant visual features (or words) is created in an a priori, offline learning phase using Bag-of-Words modeling. The histogram of a segmented object is then converted to a sound signal whose volume and 3D placement are determined by the object's position relative to the user. The system then relies on the considerable discriminating power of the human brain to localize and “classify” the sound, enabling the user to distinguish between visually distinct object classes. The paper describes the components of MVSS in detail and presents promising initial experimental results.
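The following is a minimal sketch of the histogram-to-sound idea outlined above, not the authors' implementation. It assumes a pre-learned visual-word dictionary (here a random stand-in for k-means cluster centers), a hypothetical descriptor dimension, and a simple mapping in which each histogram bin weights one harmonic while azimuth and distance from the depth stage control stereo panning and volume.

```python
import numpy as np

RNG = np.random.default_rng(0)
DICT_SIZE = 64   # assumed number of visual words
DESC_DIM = 128   # assumed local-descriptor length (e.g., SIFT-like)

# Stand-in for the dictionary learned offline via Bag-of-Words modeling.
dictionary = RNG.normal(size=(DICT_SIZE, DESC_DIM))

def bow_histogram(descriptors: np.ndarray) -> np.ndarray:
    """Quantize each descriptor to its nearest visual word and count."""
    d2 = ((descriptors[:, None, :] - dictionary[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=DICT_SIZE).astype(float)
    return hist / hist.sum()  # normalized appearance signature

def sonify(hist: np.ndarray, azimuth: float, distance: float,
           sr: int = 16000, dur: float = 0.5) -> np.ndarray:
    """Map histogram bins to harmonic amplitudes; place the sound in stereo.

    azimuth in [-1, 1] (left..right) and distance in meters are assumed
    to come from the depth-based localization stage.
    """
    t = np.linspace(0.0, dur, int(sr * dur), endpoint=False)
    base = 220.0  # assumed base frequency in Hz
    # Each visual word contributes one harmonic, weighted by its bin value.
    mono = sum(a * np.sin(2 * np.pi * base * (k + 1) * t)
               for k, a in enumerate(hist))
    mono *= 1.0 / (1.0 + distance)       # volume falls off with distance
    left = mono * (1.0 - azimuth) / 2.0  # constant-sum stereo panning
    right = mono * (1.0 + azimuth) / 2.0
    return np.stack([left, right], axis=1)

# Example: a segmented object with 200 local descriptors, placed to the right.
descs = RNG.normal(size=(200, DESC_DIM))
stereo = sonify(bow_histogram(descs), azimuth=0.5, distance=2.0)
print(stereo.shape)  # (8000, 2) stereo samples
```

Under this mapping, objects with different appearance histograms produce different timbres, while the spatial cues (panning and volume) convey where the object is; the user's auditory system then does the localization and discrimination, as the abstract describes.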