Search results

Items from 1 to 9 out of 9 results

chapter

Speech vs music discrimination using Empirical Mode Decomposition

Banriskhem K. Khonglah, Rajib Sharma, S. R. Mahadeva Prasanna

2015 Twenty First National Conference on Communications (NCC) > 1 - 6

2015 Twenty First National Conference on Communications (NCC)

This work explores the use of Empirical Mode Decomposition (EMD) for discriminating speech regions from music in audio recordings. The different frequency scales or Intrinsic Mode Functions (IMFs) obtained from EMD of the audio signal are found to contain discriminatory evidence for distinguishing the speech regions from the music regions of the audio signal. Different statistical measures like mean,...

chapter

Speech/Music Classification Using Empirical Mode Decomposition

A Ghosal, B C Dhara, S K Saha

2011 Second International Conference on Emerging Applications of Information Technology > 49 - 52

Second International Conference on Emerging Applications of Information Technology (EAIT 2011)

Audio classification serves as a fundamental step towards applications like content-based audio retrieval. In this work, we have tried to exploit the inherent difference in the composition of speech and music signals. A music signal has richer frequency components than a speech signal. The energy distribution of speech and music signals also reflects a pattern that can be used to differentiate...

chapter

Modified Local Discriminant Bases and Its Application in Audio Feature Extraction

Zheng Jiming, Wei Guohua, Yang Chunde

2009 International Forum on Information Technology and Applications > 3 > 49 - 52

2009 International Forum on Information Technology and Applications (IFITA)

One of the major challenges in classification problems based on signal decomposition approach is to identify the right basis function and its derivatives that can provide optimal features to distinguish the classes. Local discriminant bases (LDB) algorithm is one such algorithm, which efficiently selects a set of significant basis functions from the library of orthonormal bases based on certain defined...

chapter

On the importance of modeling temporal information in music tag annotation

J. Reed, Chin-Hui Lee

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 1873 - 1876

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

Music is an art form in which sounds are organized in time; however, current approaches for determining similarity and classification largely ignore temporal information. This paper presents an approach to automatic tagging which incorporates temporal aspects of music directly into the statistical models, unlike the typical bag-of-frames paradigm in traditional music information retrieval techniques...

chapter

Audio classification based on sinusoidal model: A new feature

J. Shirazi, S. Ghaemmaghami

TENCON 2008 - 2008 IEEE Region 10 Conference > 1 - 5

TENCON 2008 - 2008 IEEE Region 10 Conference

In this paper, a new feature set is presented and evaluated based on sinusoidal modeling of audio signals. Duration of the longest sinusoidal model frequency track, as a measure of the harmony, is used and compared to typical features as input into an audio classifier. The performance of this sinusoidal model feature is evaluated through classification of audio to speech and music using both the GMM...

chapter

A Speech/Music/Silence/Garbage Classifier for Searching and Indexing Broadcast News Material

Y. Patsis, W. Verhelst

2008 19th International Conference on Database and Expert Systems Applications > 585 - 589

2008 19th International Conference on Database and Expert Systems Applications (DEXA)

An audio classifier that can distinguish between speech, music, silence and garbage has been developed. The classifier was trained and tested on broadcast news material provided by VRT (Flemish Radio and Television Network). Several feature sets and machine learning algorithms have been tested, providing choices of speed and performance for a target system. The audio classifier is part of a greater...

article

A Large Margin Algorithm for Speech-to-Phoneme and Music-to-Score Alignment

J. Keshet, S. Shalev-Shwartz, Y. Singer, D. Chazan

IEEE Transactions on Audio, Speech, and Language Processing > 2007 > 15 > 8 > 2373 - 2382

We describe and analyze a discriminative algorithm for learning to align an audio signal with a given sequence of events that tag the signal. We demonstrate the applicability of our method for the tasks of speech-to-phoneme alignment ("forced alignment") and music-to-score alignment. In the first alignment task, the events that tag the speech signal are phonemes while in the music alignment...

chapter

Musical Signal Type Discrimination based on Large Open Feature Sets

B. Schuller, F. Wallhoff, D. Arsic, G. Rigoll

2006 IEEE International Conference on Multimedia and Expo > 1089 - 1092

2006 IEEE International Conference on Multimedia and Expo

Automatic discrimination of musical signal types such as speech, singing, music, genres or drumbeats within audio streams is of great importance, e.g. for radio broadcast stream segmentation. Yet, feature sets are largely discussed. We therefore suggest a large open feature set approach starting with systematic generation of 7k high-level features based on MPEG-7 low-level-descriptors and further feature...

chapter

A Simplified Early Auditory Model with Application in Speech/Music Classification

Wei Chu, Benoit Champagne

2006 Canadian Conference on Electrical and Computer Engineering > 775 - 778

2006 Canadian Conference on Electrical and Computer Engineering

The past decade has seen extensive research on audio classification and segmentation algorithms. However, the effect of background noise on the performance of classification has not been investigated widely. Recently, an early auditory model that calculates a so-called auditory spectrum has been employed in audio classification where excellent performance is reported along with robustness in noisy...

Filter options

Keywords:
SPEECH PROCESSING
SUPPORT VECTOR MACHINES

