Singing is the act of producing musical sound with the voice. It is an integral part of Indian culture. From the ancient mythological era through the various medieval periods, different genres and varieties of singers have emerged. The art of singing in India also varies from region to region and from one gharana to another. It can be practised as religious devotion, as a source of pleasure or ritual, or as a part...
The perception of emotion is critical for social interactions. Nonlinguistic signals such as those in the human voice and musical instruments are used for communicating emotion. Using an adaptation paradigm, this study examines the extent to which common mental mechanisms are applied for emotion processing of instrumental and vocal sounds. In two experiments we show that prolonged exposure to affective...
This paper addresses the problem of classifying continuous, general-purpose audio data for content-based retrieval. It presents a scheme for classifying audio data, with segmentation performed on the same data so that processing is faster. The audio data can be classified into eight categories: simple speech, noise, silence, music, single speech with music, double speech with...
This paper presents a comparison of five commonly used methods for fundamental frequency detection in speech signals, and specifically in vocal and melodic instrument signals. The efficiency of each method is verified on a known set of musical notes performed on a bass clarinet. The highest efficiency in fundamental frequency detection was reached by AutoCorrelation (ACF) and Modified AutoCorrelation (MACF)...
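The autocorrelation approach named in this abstract can be illustrated with a minimal sketch; the test tone, sample rate, and search range below are illustrative assumptions, not the paper's experimental setup:

```python
import numpy as np

def acf_f0(signal, sr, fmin=50.0, fmax=1000.0):
    """Estimate the fundamental frequency of a frame via the autocorrelation function (ACF)."""
    x = signal - np.mean(signal)
    # Autocorrelation; index 0 of the slice corresponds to lag 0
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    # Search for the strongest peak within the plausible lag range
    lag_min = int(sr / fmax)
    lag_max = int(sr / fmin)
    lag = lag_min + np.argmax(acf[lag_min:lag_max])
    return sr / lag

sr = 16000
t = np.arange(int(0.05 * sr)) / sr
tone = np.sin(2 * np.pi * 220.0 * t)   # 50 ms test tone at 220 Hz
print(acf_f0(tone, sr))                # ≈ 220 Hz
```

The lag of the strongest ACF peak within the allowed range is taken as the period; restricting the search to [sr/fmax, sr/fmin] avoids the trivial peak at lag 0 and sub-octave errors.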
This work explores the use of Empirical Mode Decomposition (EMD) for discriminating speech regions from music in audio recordings. The different frequency scales or Intrinsic Mode Functions (IMFs) obtained from EMD of the audio signal are found to contain discriminatory evidence for distinguishing the speech regions from the music regions of the audio signal. Different statistical measures like mean,...
In order to automatically extract the main melody contours from polyphonic music, especially songs with a vocal melody, we present an effective approach based on a Bayesian framework. Drawing on various information from the music signals, we use a pitch evolution model describing how the pitch contour changes, and an acoustic model representing the acoustic characteristics under a hypothesized pitch,...
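Combining a pitch-evolution (transition) model with a per-frame acoustic model is the classic setup for Viterbi decoding of a pitch contour. A minimal sketch under that assumption; the discretized state space, toy matrices, and function names here are illustrative, not the paper's actual models:

```python
import numpy as np

def viterbi_pitch(obs_loglik, trans_loglik):
    """Decode the most likely pitch-state sequence.

    obs_loglik:   (T, S) acoustic log-likelihood of each pitch state per frame
    trans_loglik: (S, S) pitch-evolution log-probabilities (row: from, col: to)
    """
    T, S = obs_loglik.shape
    score = obs_loglik[0].copy()
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + trans_loglik        # (from, to)
        back[t] = np.argmax(cand, axis=0)
        score = cand[back[t], np.arange(S)] + obs_loglik[t]
    path = [int(np.argmax(score))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Toy example: 3 pitch states, smooth transitions preferred
trans = np.log(np.array([[0.8, 0.1, 0.1],
                         [0.1, 0.8, 0.1],
                         [0.1, 0.1, 0.8]]))
obs = np.log(np.array([[0.8, 0.1, 0.1],
                       [0.8, 0.1, 0.1],
                       [0.1, 0.8, 0.1],
                       [0.1, 0.8, 0.1]]))
print(viterbi_pitch(obs, trans))  # → [0, 0, 1, 1]
```

The transition matrix penalizes large jumps, so the decoded contour stays smooth even when individual frames are ambiguous.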
There are various kinds of sound signal analysis methods. Sinusoidal modeling, one of these methods, is based on the idea that any sound signal can be expressed as the sum of sinusoidal components whose instantaneous frequency and amplitude vary continuously with time. Sinusoidal modeling is known to be a good model for sound signals, but it has been applied to data which had only...
In this paper, an approach is presented that identifies music samples which are difficult for current state-of-the-art beat trackers. In order to estimate this difficulty even for examples without ground truth, a method motivated by selective sampling is applied. This method assigns a degree of difficulty to a sample based on the mutual disagreement between the output of various beat tracking systems...
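The mutual-disagreement idea in this abstract can be sketched with a toy committee: each tracker outputs beat times, pairwise agreement is the fraction of one tracker's beats matched by another within a tolerance, and difficulty is one minus the mean agreement. The beat times, the ±70 ms tolerance, and this particular agreement score are illustrative stand-ins for the paper's actual systems and measures:

```python
import numpy as np

def pairwise_agreement(beats_a, beats_b, tol=0.07):
    """Fraction of beats in `beats_a` matched by a beat in `beats_b` within `tol` seconds."""
    hits = sum(np.min(np.abs(np.asarray(beats_b) - t)) <= tol for t in beats_a)
    return hits / len(beats_a)

def difficulty(beat_outputs, tol=0.07):
    """1 minus the mean pairwise agreement across all ordered tracker pairs."""
    scores = []
    n = len(beat_outputs)
    for i in range(n):
        for j in range(n):
            if i != j:
                scores.append(pairwise_agreement(beat_outputs[i], beat_outputs[j], tol))
    return 1.0 - float(np.mean(scores))

same = [0.0, 0.5, 1.0, 1.5]
offbeat = [0.25, 0.75, 1.25, 1.75]
print(difficulty([same, same]))          # → 0.0 (trackers agree: easy sample)
print(difficulty([same, offbeat]))       # → 1.0 (trackers disagree: hard sample)
```

Samples with high difficulty can then be flagged for manual annotation, in the spirit of selective sampling.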
In this paper, we propose a semi-supervised algorithm based on sparse non-negative matrix factorization (NMF) to improve separation of speech from background music in monaural signals. In our approach, fixed speech basis vectors are obtained from training data whereas music bases are estimated on-the-fly to cope with spectral variability while preserving small NMF dimensionality for decreased computation...
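The fixed-speech-basis idea can be sketched with plain Euclidean multiplicative NMF updates. This is a simplified stand-in for the paper's sparse NMF: the dimensions are arbitrary, the sparsity penalty is omitted, and the update rule is the textbook Euclidean one:

```python
import numpy as np

def semi_supervised_nmf(V, W_speech, k_music=4, n_iter=200, eps=1e-9):
    """Factorize a magnitude spectrogram V ≈ [W_speech | W_music] H,
    with W_speech held fixed (trained offline) and W_music learned on the fly."""
    rng = np.random.default_rng(0)
    n_freq, n_frames = V.shape
    k_s = W_speech.shape[1]
    W = np.hstack([W_speech, rng.random((n_freq, k_music))])
    H = rng.random((k_s + k_music, n_frames))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # multiplicative update for activations
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # multiplicative update for bases
        W[:, :k_s] = W_speech                  # keep speech bases fixed
    speech = W[:, :k_s] @ H[:k_s]              # speech-only reconstruction
    music = W[:, k_s:] @ H[k_s:]
    return speech, music

# Synthetic check: a mixture built from a known speech basis plus a music part
rng = np.random.default_rng(1)
Ws = rng.random((20, 4))
V = Ws @ rng.random((4, 30)) + rng.random((20, 4)) @ rng.random((4, 30))
speech, music = semi_supervised_nmf(V, Ws)
err = np.linalg.norm(speech + music - V) / np.linalg.norm(V)
print(err < 0.25)  # → True: the two parts jointly reconstruct the mixture
```

Masking the mixture spectrogram with `speech / (speech + music)` would then give a soft separation filter.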
Audio classification serves as the fundamental step towards applications like content-based audio retrieval. In this work, we have tried to exploit the inherent difference in the composition of speech and music signals. A music signal has richer frequency components than a speech signal. The energy distributions of speech and music signals also reflect a pattern that can be used to differentiate...
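The "richer frequency components" observation is commonly captured by features such as the spectral centroid, which a harmonically rich signal pushes upward. A small sketch; the synthetic signals and parameters below are illustrations, not the paper's feature set:

```python
import numpy as np

def spectral_centroid(frame, sr):
    """Magnitude-weighted mean frequency of a frame's spectrum."""
    mag = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), 1.0 / sr)
    return float(np.sum(freqs * mag) / (np.sum(mag) + 1e-12))

sr, n = 8000, 2000                        # 0.25 s frame; integer cycle counts avoid leakage
t = np.arange(n) / sr
plain = np.sin(2 * np.pi * 200 * t)       # single low partial (speech-like toy signal)
rich = sum(np.sin(2 * np.pi * f * t) for f in (200, 1200, 2400))  # harmonically rich
print(spectral_centroid(plain, sr) < spectral_centroid(rich, sr))  # → True
```

Thresholding such features per frame, or feeding them to a simple classifier, is a common baseline for speech/music discrimination.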
We present novel fast multi-pass decoding strategies for recognizing large sets of named entities on a low-resource embedded device, thereby enabling retrieval of MP3 music using spoken queries that contain partial segments of whole music titles and artist names. After acoustic-phonetic decoding in the first-stage processing, we incorporate word boundary information together with a phonetic confusion matrix into the next-stage partial...
The historic acoustic-phonetic collection (HAPS) of the Dresden University of Technology [47] preserves historic material from more than 100 years of experimental phonetics in Germany and more than 50 years of speech technology in Dresden. The latter began with the development of a channel vocoder in the 1950s, which was the starting point for continuous investigations in speech analysis and synthesis...
In this paper we propose an approach for the problem of single channel source separation of speech and music signals. Our approach is based on representing each source's power spectral density using dictionaries and nonlinearly projecting the mixture signal spectrum onto the combined span of the dictionary entries. We encourage sparsity and continuity of the dictionary coefficients using penalty terms...
A prerequisite for identifying the singers in popular music recordings is to reduce the interference of background accompaniment when trying to characterize the singer's voice. This study proposes a background music removal approach for singer identification (SID) that exploits the underlying relationships between solo voices and their accompanied versions in the cepstrum. The relationships are characterized...
The goal of the Interactive Music Archive Access System (IMAAS) project was to develop a system that allows an end-user to easily extract rhythmic, melodic, and harmonic musical metadata descriptors from audio, and to interact with the archive contents in a manner not typically supported by archive access systems. To this end, the IMAAS...
Techniques for using a microphone array to determine a sound source's location (the localization problem) have been studied for many years. A popular method is the so-called MUSIC (Multiple Signal Classification) algorithm. A second type of method tries to solve both the sound separation and localization problems in one setting; its use for localization is less well known. In this study,...
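MUSIC localizes sources by projecting candidate steering vectors onto the noise subspace of the array covariance matrix and looking for peaks where that projection vanishes. A compact sketch for a uniform linear array; the geometry, half-wavelength spacing, and simulated source are illustrative assumptions:

```python
import numpy as np

def music_spectrum(X, n_sources, angles_deg, d=0.5):
    """MUSIC pseudospectrum for a uniform linear array.

    X: (n_mics, n_snapshots) complex snapshots
    d: sensor spacing in wavelengths
    """
    n_mics = X.shape[0]
    R = X @ X.conj().T / X.shape[1]            # sample covariance
    _, vecs = np.linalg.eigh(R)                # eigenvalues in ascending order
    En = vecs[:, : n_mics - n_sources]         # noise-subspace eigenvectors
    out = []
    for theta in np.deg2rad(angles_deg):
        a = np.exp(-2j * np.pi * d * np.arange(n_mics) * np.sin(theta))
        out.append(1.0 / np.real(a.conj() @ En @ En.conj().T @ a))
    return np.array(out)

# Simulate one narrowband source at 20 degrees on an 8-microphone array
rng = np.random.default_rng(0)
n_mics, n_snap, true_deg = 8, 200, 20.0
a_true = np.exp(-2j * np.pi * 0.5 * np.arange(n_mics) * np.sin(np.deg2rad(true_deg)))
s = rng.standard_normal(n_snap) + 1j * rng.standard_normal(n_snap)
noise = 0.05 * (rng.standard_normal((n_mics, n_snap))
                + 1j * rng.standard_normal((n_mics, n_snap)))
X = np.outer(a_true, s) + noise
angles = np.arange(-90, 91)
print(angles[np.argmax(music_spectrum(X, 1, angles))])  # peak near 20
```

Steering vectors belonging to true source directions are (nearly) orthogonal to the noise subspace, so the pseudospectrum spikes at those angles.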
A higher-order model to determine audibility of audio signals is presented. Previous models have been energy based (second order) and adequate only for stationary, narrow-band signals. Music, speech and other audio signals are nonstationary and wideband so traditional energy models poorly predict the audibility of these sounds. The predictions from the higher-order model are compared to actual subjective...
Extracting a singing voice from its music accompaniment can significantly facilitate certain applications of Music Information Retrieval including singer identification and singing melody extraction. In this paper, we present a hybrid approach for this purpose, which combines properties of the Azimuth Discrimination and Resynthesis (ADRess) method with Independent Component Analysis (ICA). Our proposed...
The extraction of local tempo and beat information from audio recordings constitutes a challenging task, particularly for music that reveals significant tempo variations. Furthermore, the existence of various pulse levels such as measure, tactus, and tatum often makes the determination of absolute tempo problematic. In this paper, we present a robust mid-level representation that encodes local tempo...
Expressing the similarity between musical streams is a challenging task, as it involves understanding many factors that are most often blended into one information channel: the audio stream. Consequently, separating the musical audio stream into its main melody and its accompaniment may prove useful for rooting the similarity computation in a more robust and expressive representation...