Content-based music information retrieval tasks have traditionally been solved using engineered features and shallow processing architectures. In recent years, there has been increasing interest in using feature learning and deep architectures instead, thus reducing the required engineering effort and the need for prior knowledge. However, this new approach typically still relies on mid-level representations...
This paper presents an intra-note segmentation method for monophonic recordings based on acoustic feature variation; each musical note is separated into onset, steady and offset states. The task of intra-note segmentation from audio signals is to detect change points in acoustic features. In the proposed method, a Markov process is assumed for the state transitions, and the time-varying acoustic features are represented...
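As a generic illustration of decoding such a Markov state sequence (not the paper's actual model), a minimal Viterbi pass over a left-to-right onset/steady/offset chain can be sketched as follows; all transition, initial and emission probabilities here are invented toy values:

```python
import numpy as np

def viterbi(log_emit, log_trans, log_init):
    """Most likely state path given per-frame emission log-likelihoods."""
    T, S = log_emit.shape
    delta = log_init + log_emit[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_trans   # scores[i, j]: come from i, land in j
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_emit[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Left-to-right chain: onset (0) -> steady (1) -> offset (2), no going back.
log_trans = np.log(np.array([[0.6, 0.4, 0.0],
                             [0.0, 0.8, 0.2],
                             [0.0, 0.0, 1.0]]) + 1e-12)
log_init = np.log(np.array([0.9, 0.05, 0.05]))
# Toy emissions: 2 onset-like frames, 4 stable frames, 2 decaying frames.
log_emit = np.log(np.array([[0.8, 0.1, 0.1]] * 2 +
                           [[0.1, 0.8, 0.1]] * 4 +
                           [[0.1, 0.1, 0.8]] * 2))

path = viterbi(log_emit, log_trans, log_init)
```

The decoded path segments the note into its three intra-note states, with change points wherever the state index increases.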
The auditory cortex in the brain effortlessly does a better job of extracting information from the acoustic world than our current generation of signal processing algorithms. Abstracting the principles of the auditory cortex, the proposed architecture is based on Kalman filters with hierarchically coupled state models that stabilize the input dynamics and provide a representation space. This approach...
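As a generic illustration of the predict/correct cycle that any Kalman-filter-based architecture builds on (not the hierarchically coupled model described here), a minimal scalar filter with made-up noise constants can be sketched as:

```python
import numpy as np

def kalman_1d(zs, q=1e-4, r=0.01):
    """Scalar Kalman filter tracking a slowly varying level under noise.
    q: process noise variance, r: measurement noise variance (toy values)."""
    x, p = zs[0], 1.0
    estimates = []
    for z in zs:
        p += q                 # predict: state uncertainty grows
        k = p / (p + r)        # Kalman gain balances model vs. measurement
        x += k * (z - x)       # correct the state toward the measurement
        p *= (1.0 - k)
        estimates.append(x)
    return np.array(estimates)

rng = np.random.default_rng(1)
zs = 1.0 + 0.1 * rng.standard_normal(300)  # noisy observations of a level of 1.0
est = kalman_1d(zs)
```

The filtered estimate settles near the true level with far less variance than the raw measurements, which is the stabilizing behavior the coupled state models exploit.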
This paper presents a vocal timbre analysis method based on topic modeling using latent Dirichlet allocation (LDA). Although many works have focused on analyzing characteristics of singing voices, none have dealt with “latent” characteristics (topics) of vocal timbre, which are shared by multiple singing voices. In the work described in this paper, we first automatically extracted vocal timbre features...
Musical onset detection is one of the most elementary tasks in music analysis, but is still solved only imperfectly for polyphonic music signals. Interpreted as a computer vision problem in spectrograms, Convolutional Neural Networks (CNNs) seem to be an ideal fit. On a dataset of about 100 minutes of music with 26k annotated onsets, we show that CNNs outperform the previous state-of-the-art while requiring...
Transcribing lyrics from musical audio is a challenging research problem which has not benefited from many advances made in the related field of automatic speech recognition, owing to the prevalent musical accompaniment and differences between the spoken and sung voice. However, one aspect of this problem which has yet to be exploited by researchers is that significant portions of the lyrics will...
Onset detection forms the critical first stage of most beat tracking algorithms. While common spectral-difference onset detectors can work well in genres with clear rhythmic structure, they can be sensitive to loud, asynchronous events (e.g., off-beat notes in a jazz solo), which limits their general efficacy. In this paper, we investigate methods to improve the robustness of onset detection for beat...
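For reference, the spectral-difference onset detection function mentioned above can be sketched in a few lines: the half-wave-rectified, per-bin increase in spectrogram magnitude is summed per frame, and peaks mark candidate onsets. This is a minimal generic sketch (frame sizes and the synthetic test signal are arbitrary choices), not the paper's improved detector:

```python
import numpy as np

def spectral_flux(signal, frame_size=1024, hop=512):
    """Half-wave-rectified spectral difference, one value per frame step."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_size] * window
                       for i in range(n_frames)])
    mags = np.abs(np.fft.rfft(frames, axis=1))
    diff = np.diff(mags, axis=0)             # per-bin change between frames
    return np.maximum(diff, 0.0).sum(axis=1)  # keep only energy increases

# Synthetic signal: silence with two short noise bursts as "onsets".
sr = 8000
rng = np.random.default_rng(0)
x = np.zeros(sr)
for onset in (2000, 5000):
    x[onset : onset + 400] += rng.standard_normal(400)

flux = spectral_flux(x)
```

The flux peaks line up with the two bursts; the paper's point is precisely that such a detector also fires on loud asynchronous events, so its output needs further robustification before beat tracking.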
This paper proposes a novel machine learning approach for the task of on-line continuous-time music mood regression, i.e., low-latency prediction of the time-varying arousal and valence in musical pieces. On the front-end, a large set of segmental acoustic features is extracted to model short-term variations. Then, multi-variate regression is performed by deep recurrent neural networks to model longer-range...
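To make the front-end/back-end split concrete, here is a minimal vanilla (Elman) recurrent forward pass mapping per-segment feature vectors to a two-dimensional (arousal, valence) output per frame. This is an illustrative sketch with randomly initialized toy weights, not the deep recurrent architecture of the paper:

```python
import numpy as np

def rnn_regress(features, Wx, Wh, Wo, bh, bo):
    """Per-frame recurrent regression: features -> (arousal, valence)."""
    h = np.zeros(Wh.shape[0])
    out = []
    for x in features:
        h = np.tanh(Wx @ x + Wh @ h + bh)  # hidden state carries longer-range context
        out.append(Wo @ h + bo)            # linear readout at every frame (low latency)
    return np.array(out)

rng = np.random.default_rng(0)
n_hidden, n_feat = 8, 4
Wx = 0.1 * rng.standard_normal((n_hidden, n_feat))
Wh = 0.1 * rng.standard_normal((n_hidden, n_hidden))
Wo = 0.1 * rng.standard_normal((2, n_hidden))
bh, bo = np.zeros(n_hidden), np.zeros(2)

frames = rng.standard_normal((10, n_feat))     # 10 frames of segmental features
av = rnn_regress(frames, Wx, Wh, Wo, bh, bo)   # shape (10, 2)
```

Because an output is emitted at every frame as it arrives, the prediction is continuous-time and low-latency; the recurrence is what models the longer-range temporal dependencies.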
Representing music information using audio codewords has led to state-of-the-art performance on various music classification benchmarks. Compared to conventional audio descriptors, audio words offer greater flexibility in capturing the nuances of music signals, in that each codeword can be viewed as a quantization of the music universe and that the quantization grows finer as the size of the dictionary...
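The codeword idea can be sketched generically: each frame-level feature vector is quantized to its nearest codebook centroid, and a song becomes a histogram of codeword counts. The codebook and features below are toy values for illustration only:

```python
import numpy as np

def bag_of_codewords(features, codebook):
    """Quantize each feature vector to its nearest codeword and count occurrences."""
    d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return np.bincount(d.argmin(axis=1), minlength=len(codebook))

# Toy codebook of 3 codewords in a 2-D feature space (in practice learned,
# e.g. by k-means over many frames).
codebook = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
features = np.array([[0.2, -0.1], [9.5, 10.3], [0.1, 0.3], [10.2, 9.9]])

hist = bag_of_codewords(features, codebook)  # fixed-length song representation
```

Enlarging the codebook makes the quantization cells smaller, which is exactly the "finer quantization of the music universe" the abstract refers to.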
This paper focuses on the automatic rhythm analysis of musical audio at the bar level. We propose a novel approach for robust downbeat detection. It uses well-chosen complementary features, inspired by musical considerations. In particular, a note accentuation model and a detection of pattern changes are introduced. We estimate the time signature by examining the similarity of frames at the beat level...
Tempo estimation is a fundamental problem in music information retrieval. Most approaches attempt to solve two problems: first finding a dominant pulse and second correcting the metrical level of this pulse. The latter has also been dubbed fixing the octave error. We propose an algorithm for tempo estimation that addresses both problems mostly independently. While using a standard pulse detection...
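The two sub-problems can be illustrated separately: a dominant pulse estimate from inter-onset intervals, and a metrical-level ("octave") correction that folds the estimate into a plausible BPM range by doubling or halving. This is a common generic heuristic, sketched with made-up numbers, not the algorithm proposed here:

```python
import numpy as np

def fold_tempo(bpm, lo=80.0, hi=160.0):
    """Fold a tempo estimate into [lo, hi) by octave doubling/halving."""
    while bpm < lo:
        bpm *= 2.0
    while bpm >= hi:
        bpm /= 2.0
    return bpm

# Dominant pulse: onsets detected every 0.5 s -> 120 BPM.
onsets = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
raw_bpm = 60.0 / np.median(np.diff(onsets))
```

A pulse tracker that locks onto the half- or double-tempo level (60 or 240 BPM here) is mapped back to 120 BPM, which is why the two problems can be treated mostly independently.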