This work explores the use of Empirical Mode Decomposition (EMD) for discriminating speech regions from music in audio recordings. The different frequency scales or Intrinsic Mode Functions (IMFs) obtained from EMD of the audio signal are found to contain discriminatory evidence for distinguishing the speech regions from the music regions of the audio signal. Different statistical measures like mean,...
Audio classification serves as the fundamental step towards applications such as content-based audio retrieval. In this work, we have tried to exploit the inherent difference in the composition of speech and music signals. A music signal has richer frequency content than a speech signal. The energy distribution of speech and music signals also reflects a pattern that can be used to differentiate...
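As an illustration of how an energy distribution can separate the two classes, the sketch below computes short-time frame energies and the fraction of frames whose energy falls well below the average. This is a commonly used speech/music cue (speech tends to have more quiet frames because of pauses between words), not necessarily the feature used in the work above, whose abstract is truncated. The function names, the frame length, and the 0.5 threshold factor are assumptions chosen for illustration.

```python
from typing import List, Sequence


def frame_energies(signal: Sequence[float], frame_len: int, hop: int) -> List[float]:
    """Mean squared amplitude of each full frame of the signal."""
    return [
        sum(x * x for x in signal[i:i + frame_len]) / frame_len
        for i in range(0, len(signal) - frame_len + 1, hop)
    ]


def low_energy_ratio(energies: Sequence[float], factor: float = 0.5) -> float:
    """Fraction of frames whose energy is below factor * mean energy.

    The factor 0.5 is an illustrative assumption, not a value from the source.
    """
    if not energies:
        return 0.0
    mean_energy = sum(energies) / len(energies)
    low = sum(1 for e in energies if e < factor * mean_energy)
    return low / len(energies)


# Toy signals: one alternates loud and silent segments, one is steady.
bursty = [1.0] * 4 + [0.0] * 4 + [1.0] * 4 + [0.0] * 4
steady = [1.0] * 16

bursty_ratio = low_energy_ratio(frame_energies(bursty, frame_len=4, hop=4))
steady_ratio = low_energy_ratio(frame_energies(steady, frame_len=4, hop=4))
```

On these toy inputs the bursty signal yields a higher ratio than the steady one, which is the direction a classifier would rely on when treating this value as one feature among several.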
One of the major challenges in classification problems based on a signal decomposition approach is to identify the right basis function and its derivatives that can provide optimal features to distinguish the classes. The local discriminant bases (LDB) algorithm is one such approach, which efficiently selects a set of significant basis functions from the library of orthonormal bases based on certain defined...
Music is an art form in which sounds are organized in time; however, current approaches for determining similarity and classification largely ignore temporal information. This paper presents an approach to automatic tagging which incorporates temporal aspects of music directly into the statistical models, unlike the typical bag-of-frames paradigm in traditional music information retrieval techniques...
In this paper, a new feature set based on sinusoidal modeling of audio signals is presented and evaluated. The duration of the longest sinusoidal model frequency track, as a measure of the harmony, is used and compared to typical features as input into an audio classifier. The performance of this sinusoidal model feature is evaluated through classification of audio into speech and music using both the GMM...
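The feature named in the abstract above, the duration of the longest frequency track, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: spectral peaks are assumed to be already extracted per frame, tracks are continued by greedy nearest-peak matching, and the names `longest_track_duration`, `max_jump_hz`, and `hop_s` are hypothetical.

```python
from typing import List, Tuple


def longest_track_duration(
    peaks_per_frame: List[List[float]],
    max_jump_hz: float = 20.0,  # assumed tolerance for continuing a track
    hop_s: float = 0.01,        # assumed time between frames, in seconds
) -> float:
    """Duration in seconds of the longest sinusoidal frequency track.

    Each inner list holds the peak frequencies (Hz) detected in one frame.
    """
    active: List[Tuple[float, int]] = []  # (last frequency, length in frames)
    longest = 0
    for peaks in peaks_per_frame:
        unused = sorted(peaks)
        next_active: List[Tuple[float, int]] = []
        for last_freq, length in active:
            if unused:
                # Greedily continue the track with the closest unclaimed peak.
                best = min(unused, key=lambda f: abs(f - last_freq))
                if abs(best - last_freq) <= max_jump_hz:
                    unused.remove(best)
                    next_active.append((best, length + 1))
            # A track with no peak within tolerance simply ends here.
        # Every unclaimed peak starts a new track of length one.
        next_active.extend((f, 1) for f in unused)
        active = next_active
        for _, length in active:
            longest = max(longest, length)
    return longest * hop_s


# Toy input: one track near 440 Hz survives three frames, then a gap.
frames = [[440.0, 1000.0], [442.0, 1500.0], [445.0], [], [445.0]]
duration = longest_track_duration(frames)
```

Sustained musical notes would be expected to produce long tracks, while the rapidly changing pitch of speech produces shorter ones, which is why a single scalar like this can serve as classifier input.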
An audio classifier that can distinguish between speech, music, silence and garbage has been developed. The classifier was trained and tested on broadcast news material provided by VRT (Flemish Radio and Television Network). Several feature sets and machine learning algorithms have been tested, providing choices of speed and performance for a target system. The audio classifier is part of a greater...
We describe and analyze a discriminative algorithm for learning to align an audio signal with a given sequence of events that tag the signal. We demonstrate the applicability of our method for the tasks of speech-to-phoneme alignment ("forced alignment") and music-to-score alignment. In the first alignment task, the events that tag the speech signal are phonemes while in the music alignment...
Automatic discrimination of musical signal types such as speech, singing, music, genres or drumbeats within audio streams is of great importance, e.g. for radio broadcast stream segmentation. Yet, the choice of feature sets is still widely debated. We therefore suggest a large open feature set approach starting with systematic generation of 7k high-level features based on MPEG-7 low-level descriptors and further feature...
The past decade has seen extensive research on audio classification and segmentation algorithms. However, the effect of background noise on the performance of classification has not been investigated widely. Recently, an early auditory model that calculates a so-called auditory spectrum has been employed in audio classification where excellent performance is reported along with robustness in noisy...