Search results

Items from 1 to 20 out of 171 results

chapter

Noise robust speech recognition system using Mel cepstral and genetic algorithm

Garg Mamta, Arora Ajat Shatru, Gupta Savita

2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT) > 3151 - 3155

2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT)

This paper suggests a technique based on MFCC analysis for audio signals, with application to speech classification. The proposed work uses multi-resolution (wavelet) analysis and spectral-analysis-based features for feature extraction. The proposed approach uses a number of features, such as Mel Frequency Cepstral Coefficients (MFCC) and FFT coefficients, combined with wavelet-based features. In addition, accuracy...

chapter

A Preliminary Study on Deep-Learning Based Screaming Sound Detection

Md. Zaigham Zaheer, Jin Young Kim, Hyoung-Gook Kim, Seung You Na

2015 5th International Conference on IT Convergence and Security (ICITCS) > 1 - 4

2015 5th International Conference on IT Convergence and Security (ICITCS)

In addition to the traditional video surveillance, various audio processing techniques can also be added to the existing CCTV cameras. They can be used as additional features to help in analyzing the scene better and autonomously detecting violence or any unwanted activity in the scene. For this purpose, a deep learning based scream sound detection approach is proposed in this paper. MFCC features...

chapter

Guitar model recognition from single instrument audio recordings

David Johnson, George Tzanetakis

2015 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM) > 370 - 375

2015 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM)

The main goal of this paper is to explore the recognition of particular guitar models from single instrument audio recordings. This is different than existing work in music instrument recognition that deals with identifying different instrument types. Through a set of experiments we evaluate different sets of audio features and classifiers for this purpose. To improve accuracy a composite classifier...

chapter

Feature selection experiments on emotional speech classification

Piyawat Sukhummek, Sawit Kasuriya, Thanaruk Theeramunkong, Chai Wutiwiwatchai, et al.

2015 12th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON) > 1 - 4

2015 12th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON)

This paper presents experiments on feature selection for emotional speech classification. There are 152 features used in this experiment. Minimum redundancy maximum relevance (mRMR) is applied as the feature selection method. The experiments are constructed from two corpora: Interactive Emotional Dyadic Motion Capture (IEMOCAP) and Emotional Tagged Corpus on Lakorn (EMOLA) which...

chapter

Feature extraction analysis on Indonesian speech recognition system

Untari N. Wisesty, Adiwijaya, Widi Astuti

2015 3rd International Conference on Information and Communication Technology (ICoICT) > 54 - 58

2015 3rd International Conference on Information and Communication Technology (ICoICT)

Speech recognition is widely applied to speech-to-text and speech-to-emotion tasks, in order to make gadgets and computers easier to use, or to help people with hearing disabilities. Feature extraction is one of the significant steps in the performance of speech recognition. Therefore, proper selection is needed. In this paper, we analyze feature extraction that can have good performance for Indonesian speech...

chapter

Classification of vocal and non-vocal regions from audio songs using spectral features and pitch variations

Y. V. Srinivasa Murthy, Shashidhar G. Koolagudi

2015 IEEE 28th Canadian Conference on Electrical and Computer Engineering (CCECE) > 1271 - 1276

2015 IEEE 28th Canadian Conference on Electrical and Computer Engineering (CCECE)

In this work, an effort has been made to identify vocal and non-vocal regions from a given song using signal processing techniques and machine learning algorithm. Initially spectral features like mel-frequency cepstral coefficients (MFCCs) are used to develop the baseline system. Statistical values of pitch, jitter and shimmer are considered to improve performance of the system. Artificial neural...

chapter

Automatic identification of bird species: A comparison between kNN and SOM classifiers

Dorota Kaminska, Artur Gmerek

2012 Joint Conference New Trends In Audio & Video And Signal Processing: Algorithms, Architectures, Arrangements And Applications (NTAV/SPA) > 77 - 82

2012 Joint Conference New Trends in Audio & Video and Signal Processing: Algorithms, Architectures, Arrangements, and Applications (NTAV/SPA)

This paper presents a system for automatic bird identification, which uses audio input. The experiments have been conducted on three groups of birds, which were created basing finishing on classification, the system is fully automated. The main problem in automatic bird recognition (ABR) is the choice of proper features and classifiers. Identification has been made using two classifiers: kNN (k Nearest...

chapter

Text-constrained speaker verification using fuzzy C means vector quantization

Debnath Saswati, Soni Badal, Das Pradip K.

2015 International Conference on Communications and Signal Processing (ICCSP) > 1511 - 1515

2015 International Conference on Communications and Signal Processing (ICCSP)

The most successful approach to speech and speaker recognition is to treat the speech signal as a stochastic pattern and to use a statistical pattern recognition technique for matching utterances. This paper attempts to study the performance of Text dependent speaker verification system using Delta-Delta Mel Frequency Cepstral Coefficients (MFCC-Δ-Δ) feature vector and Fuzzy C means (FCM) speaker...

chapter

Detection of depression in adolescents based on statistical modeling of emotional influences in parent-adolescent conversations

Melissa N Stolar, Margaret Lech, Nicholas B Allen

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 987 - 991

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The current benchmark speech-based depression detection techniques rely on acoustic speech parameters collected from large sets of representative speech recordings. This study for the first time investigates depression detection based on the higher order influence model (HOIM) coefficients and emotional transition parameters derived from a relatively small set of conversational speech recordings representing...

chapter

A unified framework for filterbank and time-frequency basis vectors in ASR frontends

Xiaoyu Liu, Stephen A. Zahorian

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4659 - 4663

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

For many years, filterbanks have been widely used as one step of frontend feature extraction for Automatic Speech Recognition (ASR). In this paper, we propose a unified framework for ASR frontends, by first moving the nonlinear amplitude scaling, and then combining the filterbank weights with the cosine basis vectors. As part of this framework, we also show that the delta terms used to encode feature...

chapter

Content based clinical depression detection in adolescents

Lu-Shih Alex Low, Namunu C. Maddage, Margaret Lech, Lisa Sheeber, et al.

2009 17th European Signal Processing Conference > 2362 - 2366

2009 17th European Signal Processing Conference

This paper studies the effectiveness of speech contents for detecting clinical depression in adolescents. We also evaluated the performances of acoustic features such as Mel frequency cepstral coefficients (MFCC), short time energy (Energy), zero crossing rate (ZCR) and Teager energy operator (TEO) using Gaussian mixture models for depression detection. A clinical data set of speech from 139 adolescents,...

chapter

Akshara transcription of mrudangam strokes in Carnatic music

Jom Kuriakose, J Chaitanya Kumar, Padi Sarala, Hema A Murthy, et al.

2015 Twenty First National Conference on Communications (NCC) > 1 - 6

2015 Twenty First National Conference on Communications (NCC)

Percussion instruments play a significant role in Carnatic music concerts. The percussion artist enjoys a great degree of freedom in improvising within the defined tāla structure of a composition. The objective of this paper is to transcribe the improvisations, treating the percussion strokes as syllables or aksharas.

chapter

Raga identification of carnatic music using iterative clustering approach

Hannah Daniel, A. Revathi

2015 International Conference on Computing and Communications Technologies (ICCCT) > 19 - 24

2015 International Conference on Computing and Communications Technologies (ICCCT)

This paper proposes a method to identify the arohana-avarohana of carnatic raga. Carnatic raga is broadly classified as melakarta (parent) and janya (child) raga. Arohana-avarohana of 10 different ragas is collected from 16 different singers. 16 audio data are collected for each raga. 11 among the 16 are used in the training phase and the remaining 5 are used for testing. The acoustic feature, MFCC...

chapter

Speaker based Language Independent Isolated Speech Recognition System

Shanthi Therese S., Chelpa Lingam

2015 International Conference on Communication, Information & Computing Technology (ICCICT) > 1 - 7

2015 International Conference on Communication, Information & Computing Technology (ICCICT)

This paper presents a speaker based Language Independent Isolated Speech Recognition System (LIISRS). The most popular feature extraction technique, Mel Frequency Cepstral Coefficients (MFCC), is used for training the system. Representative specific features are identified using the K-Means algorithm. The distortion measure is calculated using the Euclidean distance function. Pitch contour characteristics are used...

chapter

A unique approach in text independent speaker recognition using MFCC feature sets and probabilistic neural network

Khan Suhail Ahmad, Anil S. Thosar, Jagannath H. Nirmal, Vinay S. Pande

2015 Eighth International Conference on Advances in Pattern Recognition (ICAPR) > 1 - 6

2015 Eighth International Conference on Advances in Pattern Recognition (ICAPR)

This paper motivates the use of combination of mel frequency cepstral coefficients (MFCC) and its delta derivatives (DMFCC and DDMFCC) calculated using mel spaced Gaussian filter banks for text independent speaker recognition. MFCC modeled on the human auditory system shows robustness against noise and session changes and hence has become synonymous with speaker recognition. Our main aim is to test...

chapter

Classification of emotions from speech using implicit features

Mohit Srivastava, Anupam Agarwal

2014 9th International Conference on Industrial and Information Systems (ICIIS) > 1 - 6

2014 9th International Conference on Industrial and Information Systems (ICIIS)

Over time, human-computer interaction has extended its branches into many other fields, such as engineering, cognition, and medicine. Speech analysis has also become an important area of concern. People are using this mode of interaction with machines to bridge the gap between the physical and digital worlds. Speech emotion recognition has become an integral subfield in the domain...

chapter

Improved robustness of biometric authentication system using features of utterance

Qian Shi, Yoshinobu Kajikawa

2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 4

2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

In this paper, we propose a novel biometric authentication system using motion vectors of lips. We have already proposed a biometric authentication system using multimodal features of utterance. However, since both the edges and texture of lips can be easily extracted from a still image, an imposter may be recognized as a registrant by using a still image of the registrant. Therefore, the robustness...

chapter

Classification of respiratory pathology in pulmonary acoustic signals using parametric features and artificial neural network

Rajkumar Palaniappan, Kenneth Sundaraj, Sebastian Sundaraj, N. Huliraj, et al.

2014 IEEE International Conference on Computational Intelligence and Computing Research > 1 - 6

2014 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC)

Pulmonary acoustic signal analysis provides essential information on the present state of the lungs. In this paper, we intend to distinguish between normal, airway obstruction pathology and interstitial lung disease using pulmonary acoustic signal recordings. The proposed method extracts Mel frequency cepstral coefficients (MFCC) and AR coefficients as features from pulmonary acoustic signals. The...

chapter

Feature extraction using Spectral Centroid and Mel Frequency Cepstral Coefficient for Quranic Accent Automatic Identification

Noraziahtulhidayu Kamarudin, S.A.R Al-Haddad, Shaiful Jahari Hashim, Mohammad Ali Nematollahi, et al.

2014 IEEE Student Conference on Research and Development > 1 - 6

2014 IEEE Student Conference on Research and Development (SCOReD)

This paper presents the process of Quranic Accent Automatic Identification. Recent feature extraction techniques used for Quranic verse rule identification (Tajweed) include Mel Frequency Cepstral Coefficients (MFCC), which are prone to additive noise and may reduce the classification result. Therefore, to improve the performance of MFCC with addition of Spectral Centroid features and is proposed...

chapter

Classification of emphatic consonants and their counterparts in Modern Standard Arabic using neural networks

Yasser M. Seddiq, Yousef A. Alotaibi, Sid-Ahmed Selouani

2014 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT) > 73 - 77

2014 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT)

This paper presents the work of acoustic analysis related to Modern Standard Arabic (MSA). The problem of classifying the consonant counterparts in MSA is tackled here. The study considers four phonemes: /dˤ, ðˤ/ and their non-emphatic counterparts /d, ð/ respectively. An accurate automatic classification for those phonemes is to be achieved. Artificial neural networks (ANNs) are used for that purpose...

Data set:
ieee
Keywords:
FEATURE EXTRACTION
ACCURACY
MEL FREQUENCY CEPSTRAL COEFFICIENT

Content availability

Available (170)
None (1)

Publication type

book (167)
article (4)

Keywords

SPEECH (92)
SPEECH RECOGNITION (57)
TRAINING (36)
SUPPORT VECTOR MACHINES (35)
MFCC (33)
SPEAKER RECOGNITION (31)
CLASSIFICATION ALGORITHMS (28)
HIDDEN MARKOV MODELS (28)
DATABASES (26)
SPEECH PROCESSING (26)
AUDIO SIGNAL PROCESSING (20)
MUSIC (19)
SUPPORT VECTOR MACHINE CLASSIFICATION (19)
CEPSTRAL ANALYSIS (18)
FILTER BANKS (15)
INSTRUMENTS (15)
ARTIFICIAL NEURAL NETWORKS (14)
NOISE (14)
SPEAKER IDENTIFICATION (13)
GMM (12)
PATTERN CLASSIFICATION (12)
COMPUTATIONAL MODELING (11)
DATA MINING (11)
ROBUSTNESS (11)
SUPPORT VECTOR MACHINE (11)
TESTING (11)
VECTORS (11)
GAUSSIAN PROCESSES (10)
SIGNAL CLASSIFICATION (10)
SVM (10)
CORRELATION (9)
FILTER BANK (9)
MEL FREQUENCY CEPSTRAL COEFFICIENTS (9)
PRINCIPAL COMPONENT ANALYSIS (9)
SIGNAL PROCESSING (9)
AUTOMATIC SPEECH RECOGNITION (8)
CLASSIFICATION (8)
COMPUTERS (8)
EMOTION RECOGNITION (8)
FEATURE SELECTION (8)
KERNEL (8)
PATTERN RECOGNITION (8)
ACOUSTIC SIGNAL PROCESSING (7)
ACOUSTICS (7)
GAUSSIAN MIXTURE MODEL (7)
MATHEMATICAL MODEL (7)
VECTOR QUANTIZATION (7)
CONFERENCES (6)
EQUATIONS (6)
FILTERING THEORY (6)
MULTILAYER PERCEPTRONS (6)
SIGNAL PROCESSING ALGORITHMS (6)
SIGNAL TO NOISE RATIO (6)
ALGORITHM DESIGN AND ANALYSIS (5)
CEPSTRUM (5)
EDUCATIONAL INSTITUTIONS (5)
FILTERING (5)
HIDDEN MARKOV MODEL (5)
MACHINE LEARNING (5)
MEL-FREQUENCY CEPSTRAL COEFFICIENTS (5)
MULTIMEDIA COMMUNICATION (5)
MUSIC INFORMATION RETRIEVAL (5)
NOISE MEASUREMENT (5)
PEDIATRICS (5)
POLYNOMIALS (5)
SPEAKER VERIFICATION (5)
SPECTRAL ANALYSIS (5)
TIMBRE (5)
TIME FREQUENCY ANALYSIS (5)
TRAINING DATA (5)
AUDIO CLASSIFICATION (4)
CLUSTERING ALGORITHMS (4)
CLUSTERING METHODS (4)
DATA MODELS (4)
ELECTRONIC MAIL (4)
FREQUENCY MODULATION (4)
GAUSSIAN MIXTURE MODELS (4)
HARMONIC ANALYSIS (4)
INFORMATION RETRIEVAL (4)
LPC (4)
MEL FREQUENCY CEPSTRAL COEFFICIENTS (MFCC) (4)
MODULATION (4)
MULTIPLE SIGNAL CLASSIFICATION (4)
MUSIC GENRE CLASSIFICATION (4)
MUSICAL INSTRUMENTS (4)
OPTIMIZATION (4)
PATTERN CLUSTERING (4)
PSYCHOLOGY (4)
SPEECH ANALYSIS (4)
SPEECH CLASSIFICATION (4)
TIME-FREQUENCY ANALYSIS (4)
TRANSFORMS (4)
VISUALIZATION (4)
ACOUSTIC FEATURES (3)
ACOUSTIC MEASUREMENTS (3)
ACOUSTIC MODEL (3)
ADAPTATION MODEL (3)

INFONA - science communication portal
