Content-based music information retrieval tasks have traditionally been solved using engineered features and shallow processing architectures. In recent years, there has been increasing interest in using feature learning and deep architectures instead, thus reducing the required engineering effort and the need for prior knowledge. However, this new approach typically still relies on mid-level representations...
This paper presents an intra-note segmentation method for monophonic recordings based on acoustic feature variation; each musical note is separated into onset, steady and offset states. The task of intra-note segmentation from audio signals is to detect change points in acoustic features. In the proposed method, a Markov process is assumed for the state transitions, and the time-varying acoustic features are represented...
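As a generic illustration of decoding such a Markov state sequence (not the paper's actual model), a minimal Viterbi pass over a left-to-right onset/steady/offset chain can be sketched as follows; all transition, initial and emission probabilities here are invented toy values:

```python
import numpy as np

def viterbi(log_emit, log_trans, log_init):
    """Most likely state path given per-frame emission log-likelihoods."""
    T, S = log_emit.shape
    delta = log_init + log_emit[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_trans   # scores[i, j]: come from i, land in j
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_emit[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Left-to-right chain: onset (0) -> steady (1) -> offset (2), no going back.
log_trans = np.log(np.array([[0.6, 0.4, 0.0],
                             [0.0, 0.8, 0.2],
                             [0.0, 0.0, 1.0]]) + 1e-12)
log_init = np.log(np.array([0.9, 0.05, 0.05]))
# Toy emissions: 2 onset-like frames, 4 stable frames, 2 decaying frames.
log_emit = np.log(np.array([[0.8, 0.1, 0.1]] * 2 +
                           [[0.1, 0.8, 0.1]] * 4 +
                           [[0.1, 0.1, 0.8]] * 2))

path = viterbi(log_emit, log_trans, log_init)
```

The decoded path segments the note into its three intra-note states, with change points wherever the state index increases.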
The auditory cortex in the brain effortlessly does a better job of extracting information from the acoustic world than our current generation of signal processing algorithms. Abstracting the principles of the auditory cortex, the proposed architecture is based on Kalman filters with hierarchically coupled state models that stabilize the input dynamics and provide a representation space. This approach...
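As a generic illustration of the predict/correct cycle that any Kalman-filter-based architecture builds on (not the hierarchically coupled model described here), a minimal scalar filter with made-up noise constants can be sketched as:

```python
import numpy as np

def kalman_1d(zs, q=1e-4, r=0.01):
    """Scalar Kalman filter tracking a slowly varying level under noise.
    q: process noise variance, r: measurement noise variance (toy values)."""
    x, p = zs[0], 1.0
    estimates = []
    for z in zs:
        p += q                 # predict: state uncertainty grows
        k = p / (p + r)        # Kalman gain balances model vs. measurement
        x += k * (z - x)       # correct the state toward the measurement
        p *= (1.0 - k)
        estimates.append(x)
    return np.array(estimates)

rng = np.random.default_rng(1)
zs = 1.0 + 0.1 * rng.standard_normal(300)  # noisy observations of a level of 1.0
est = kalman_1d(zs)
```

The filtered estimate settles near the true level with far less variance than the raw measurements, which is the stabilizing behavior the coupled state models exploit.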
This paper presents a vocal timbre analysis method based on topic modeling using latent Dirichlet allocation (LDA). Although many works have focused on analyzing characteristics of singing voices, none have dealt with “latent” characteristics (topics) of vocal timbre, which are shared by multiple singing voices. In the work described in this paper, we first automatically extracted vocal timbre features...
Musical onset detection is one of the most elementary tasks in music analysis, but is still solved only imperfectly for polyphonic music signals. Interpreted as a computer vision problem in spectrograms, Convolutional Neural Networks (CNNs) seem to be an ideal fit. On a dataset of about 100 minutes of music with 26k annotated onsets, we show that CNNs outperform the previous state-of-the-art while requiring...
Transcribing lyrics from musical audio is a challenging research problem which has not benefited from many advances made in the related field of automatic speech recognition, owing to the prevalent musical accompaniment and differences between the spoken and sung voice. However, one aspect of this problem which has yet to be exploited by researchers is that significant portions of the lyrics will...
Onset detection forms the critical first stage of most beat tracking algorithms. While common spectral-difference onset detectors can work well in genres with clear rhythmic structure, they can be sensitive to loud, asynchronous events (e.g., off-beat notes in a jazz solo), which limits their general efficacy. In this paper, we investigate methods to improve the robustness of onset detection for beat...
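For reference, the spectral-difference onset detection function mentioned above can be sketched in a few lines: the half-wave-rectified, per-bin increase in spectrogram magnitude is summed per frame, and peaks mark candidate onsets. This is a minimal generic sketch (frame sizes and the synthetic test signal are arbitrary choices), not the paper's improved detector:

```python
import numpy as np

def spectral_flux(signal, frame_size=1024, hop=512):
    """Half-wave-rectified spectral difference, one value per frame step."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_size] * window
                       for i in range(n_frames)])
    mags = np.abs(np.fft.rfft(frames, axis=1))
    diff = np.diff(mags, axis=0)             # per-bin change between frames
    return np.maximum(diff, 0.0).sum(axis=1)  # keep only energy increases

# Synthetic signal: silence with two short noise bursts as "onsets".
sr = 8000
rng = np.random.default_rng(0)
x = np.zeros(sr)
for onset in (2000, 5000):
    x[onset : onset + 400] += rng.standard_normal(400)

flux = spectral_flux(x)
```

The flux peaks line up with the two bursts; the paper's point is precisely that such a detector also fires on loud asynchronous events, so its output needs further robustification before beat tracking.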
This paper proposes a novel machine learning approach for the task of on-line continuous-time music mood regression, i.e., low-latency prediction of the time-varying arousal and valence in musical pieces. On the front-end, a large set of segmental acoustic features is extracted to model short-term variations. Then, multi-variate regression is performed by deep recurrent neural networks to model longer-range...
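To make the front-end/back-end split concrete, here is a minimal vanilla (Elman) recurrent forward pass mapping per-segment feature vectors to a two-dimensional (arousal, valence) output per frame. This is an illustrative sketch with randomly initialized toy weights, not the deep recurrent architecture of the paper:

```python
import numpy as np

def rnn_regress(features, Wx, Wh, Wo, bh, bo):
    """Per-frame recurrent regression: features -> (arousal, valence)."""
    h = np.zeros(Wh.shape[0])
    out = []
    for x in features:
        h = np.tanh(Wx @ x + Wh @ h + bh)  # hidden state carries longer-range context
        out.append(Wo @ h + bo)            # linear readout at every frame (low latency)
    return np.array(out)

rng = np.random.default_rng(0)
n_hidden, n_feat = 8, 4
Wx = 0.1 * rng.standard_normal((n_hidden, n_feat))
Wh = 0.1 * rng.standard_normal((n_hidden, n_hidden))
Wo = 0.1 * rng.standard_normal((2, n_hidden))
bh, bo = np.zeros(n_hidden), np.zeros(2)

frames = rng.standard_normal((10, n_feat))     # 10 frames of segmental features
av = rnn_regress(frames, Wx, Wh, Wo, bh, bo)   # shape (10, 2)
```

Because an output is emitted at every frame as it arrives, the prediction is continuous-time and low-latency; the recurrence is what models the longer-range temporal dependencies.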
Representing music information using audio codewords has led to state-of-the-art performance on various music classification benchmarks. Compared to conventional audio descriptors, audio words offer greater flexibility in capturing the nuances of music signals, in that each codeword can be viewed as a quantization of the music universe and that the quantization grows finer as the size of the dictionary...
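The codeword idea can be sketched generically: each frame-level feature vector is quantized to its nearest codebook centroid, and a song becomes a histogram of codeword counts. The codebook and features below are toy values for illustration only:

```python
import numpy as np

def bag_of_codewords(features, codebook):
    """Quantize each feature vector to its nearest codeword and count occurrences."""
    d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return np.bincount(d.argmin(axis=1), minlength=len(codebook))

# Toy codebook of 3 codewords in a 2-D feature space (in practice learned,
# e.g. by k-means over many frames).
codebook = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
features = np.array([[0.2, -0.1], [9.5, 10.3], [0.1, 0.3], [10.2, 9.9]])

hist = bag_of_codewords(features, codebook)  # fixed-length song representation
```

Enlarging the codebook makes the quantization cells smaller, which is exactly the "finer quantization of the music universe" the abstract refers to.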
This paper focuses on the automatic rhythm analysis of musical audio at the bar level. We propose a novel approach for robust downbeat detection. It uses well-chosen complementary features, inspired by musical considerations. In particular, a note accentuation model and a detection of pattern changes are introduced. We estimate the time signature by examining the similarity of frames at the beat level...
Tempo estimation is a fundamental problem in music information retrieval. Most approaches attempt to solve two problems: first finding a dominant pulse and second correcting the metrical level of this pulse. The latter has also been dubbed fixing the octave error. We propose an algorithm for tempo estimation that addresses both problems mostly independently. While using a standard pulse detection...
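The two sub-problems can be illustrated separately: a dominant pulse estimate from inter-onset intervals, and a metrical-level ("octave") correction that folds the estimate into a plausible BPM range by doubling or halving. This is a common generic heuristic, sketched with made-up numbers, not the algorithm proposed here:

```python
import numpy as np

def fold_tempo(bpm, lo=80.0, hi=160.0):
    """Fold a tempo estimate into [lo, hi) by octave doubling/halving."""
    while bpm < lo:
        bpm *= 2.0
    while bpm >= hi:
        bpm /= 2.0
    return bpm

# Dominant pulse: onsets detected every 0.5 s -> 120 BPM.
onsets = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
raw_bpm = 60.0 / np.median(np.diff(onsets))
```

A pulse tracker that locks onto the half- or double-tempo level (60 or 240 BPM here) is mapped back to 120 BPM, which is why the two problems can be treated mostly independently.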