Search results

Items from 1 to 20 out of 58 results

chapter

An approach for self-training audio event detectors using web data

Benjamin Elizalde, Ankit Shah, Siddharth Dalmia, Min Hun Lee, more

2017 25th European Signal Processing Conference (EUSIPCO) > 1863 - 1867

2017 25th European Signal Processing Conference (EUSIPCO)

Audio Event Detection (AED) aims to recognize sounds within audio and video recordings. AED employs machine learning algorithms commonly trained and tested on annotated datasets. However, available datasets are limited in number of samples and hence it is difficult to model acoustic diversity. Therefore, we propose combining labeled audio from a dataset and unlabeled audio from the web to improve...

chapter

Classification of the syllables sound using wavelet, Renyi entropy and AR-PSD features

Domy Kristomo, Risanuri Hidayat, Indah Soesanti

2017 IEEE 13th International Colloquium on Signal Processing & its Applications (CSPA) > 94 - 99

2017 IEEE 13th International Colloquium on Signal Processing & its Applications (CSPA)

Feature extraction plays a very important role in the speech classification process because a better feature is good for improving the classification rate. This paper presents a speech feature extraction method by using Discrete Wavelet Transform (DWT) at 7th level of decomposition with mother wavelet of Daubechies 2, Renyi Entropy (RE), Autoregressive Power Spectral Density (AR-PSD), Statistical,...

chapter

Implementation of ANN based speech recognition system on an embedded board

Pranjali P. Patange, John Sahaya Rani Alex

2017 International Conference on Nextgen Electronic Technologies: Silicon to Software (ICNETS2) > 408 - 412

2017 International Conference on Nextgen Electronic Technologies: Silicon to Software (ICNETS2)

Speech recognition systems are ubiquitous and find application in automated voice control, voice dialling and automated directory assistance. This paper aims at implementing a neural network based isolated spoken word recognition system on an embedded board, the Raspberry Pi, using the open source software Octave. Mel-Frequency Cepstral Coefficient (MFCC) features are extracted from speech signal...

chapter

Personal Identification with Face and Voice Features Extracted through Kinect Sensor

Eisuke Kita, Yi Zuo, Fumiya Saito, Xuanang Feng

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) > 545 - 551

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW)

This study describes personal identification from features of a person's face and voice. The face area is detected in a picture that includes both the face and a complicated background by using the Microsoft Kinect sensor. The voice is also recorded with the Kinect microphone array and used for personal identification. The features of the face are calculated from...

chapter

Research on the recognition of isolated Chinese lyrics in songs with accompaniment based on deep belief networks

Juanjuan Cai, Nana Wang, Hui Wang, Bing Zhu

2016 IEEE 13th International Conference on Signal Processing (ICSP) > 535 - 540

2016 IEEE 13th International Conference on Signal Processing (ICSP)

Lyrics are an important part of songs. Lyrics recognition is the basis of retrieving songs and recognizing the content of songs, which is of great value. At present, speech recognition research has made great progress. But there are still difficulties in recognizing lyrics in songs with accompaniment. Related research is generally lacking, especially for Chinese lyrics in songs with accompaniment,...

chapter

Comparison of MFCC and DWT features extractors applied to PCG classification

Mohamed Boussaa, Issam Atouf, Mohamed Atibi, Abdellatif Bennis

2016 11th International Conference on Intelligent Systems: Theories and Applications (SITA) > 1 - 5

2016 11th International Conference on Intelligent Systems: Theories and Applications (SITA)

For cardiologists, the detection of cardiac abnormalities is a very delicate and crucial task for the treatment of a patient's condition. This task requires electronic medical assistance systems that are more precise, faster and more reliable, to help cardiologists analyze and make the right decisions. These medical assistance systems tend to model the human expertise and perception using signal...

chapter

Speech recognition using Support Vector Machines

Kamil Aida-zade, Anar Xocayev, Samir Rustamov

2016 IEEE 10th International Conference on Application of Information and Communication Technologies (AICT) > 1 - 4

2016 IEEE 10th International Conference on Application of Information and Communication Technologies (AICT)

In this article we apply Support Vector Machines to the acoustic model of a speech recognition system based on MFCC and LPC features for an Azerbaijani dataset. This dataset has previously been used for speech recognition with a multilayer artificial neural network, which achieved some results. The main goal of this work is applying SVM techniques to the Azerbaijan Speech Recognition System. The variety of results of SVM...

chapter

Emotional voice conversion using deep neural networks with MCC and F0 features

Zhaojie Luo, Tetsuya Takiguchi, Yasuo Ariki

2016 IEEE/ACIS 15th International Conference on Computer and Information Science (ICIS) > 1 - 5

2016 IEEE/ACIS 15th International Conference on Computer and Information Science (ICIS)

An artificial neural network is one of the most important models for training features in a voice conversion task. Typically, Neural Networks (NNs) are not effective in processing low-dimensional F0 features. As a result, the performance of neural-network-based methods for training Mel Cepstral Coefficients (MCC) is not outstanding. However, F0 can robustly represent various prosody...

chapter

Enhancing effectiveness of emotion detection by multimodal fusion of speech parameters

R. V. Darekar, A. P. Dhande

2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT) > 3242 - 3246

2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT)

Speech processing is one of the interesting and challenging concepts in man-machine communication. Emotion detection is the process of determining the psychological state of the speaker. Pitch, formant frequencies, duration, timbre, MFCCs and energy are some of the efficient parameters from which the bulk of information can be retrieved from a speech signal. These parameters have provided good accuracy...

chapter

Speech recognition system based on short-term cepstral parameters, feature reduction method and Artificial Neural Networks

Nawel Souissi, Adnane Cherif

2016 2nd International Conference on Advanced Technologies for Signal and Image Processing (ATSIP) > 667 - 671

2016 2nd International Conference on Advanced Technologies for Signal and Image Processing (ATSIP)

The acoustic analysis can provide great results in the identification of voice disorders as a complementary tool to other medical techniques. This paper scrutinizes the Mel Frequency Cepstral Coefficients (MFCC), their first and second derivatives. A full comparative study is established in order to demonstrate that short-term cepstral parameters could be useful to conclude an efficient system for...

chapter

Analyzing Artificial Neural Networks and Dynamic Time Warping for spoken keyword recognition under transient noise conditions

Paulo Lopez-Meyer, Hector Cordourier-Maruri, Arturo Quinto-Martinez, Omesh Tickoo

2015 9th International Conference on Sensing Technology (ICST) > 274 - 277

2015 9th International Conference on Sensing Technology (ICST)

Spoken keyword recognition has been under the spotlight for the past several decades, but has gained significant attention in recent years due to the rapid increase in front-end technology applications for mobile and wearable computing. This work presents the trade-off in performance between Artificial Neural Networks (ANN) and Dynamic Time Warping (DTW) methodologies, implemented for this task under...

chapter

Optimization of cepstral features for robust lung sound classification

Nandini Sengupta, Md Sahidullah, Goutam Saha

2015 Annual IEEE India Conference (INDICON) > 1 - 6

2015 Annual IEEE India Conference (INDICON)

Detection of lung abnormalities by characterizing lung sounds has been a primary step for clinical examination for a pulmonologist. This work focuses on utilization of cepstral features for lung sound analysis and classification. The proposed method incorporates statistical properties of cepstral features along with artificial neural network (ANN) based classification. Experimental results indicate...

chapter

Speech-controlled human-computer interface for audio-visual breast self-examination guidance system

Robert Kerwin C. Billones, Elmer P. Dadios, Melvin K. Cabatuan, Laurence Gan Lim, more

2015 International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment and Management (HNICEM) > 1 - 6

2015 International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment and Management (HNICEM)

This paper presents the development of a speech-controlled human-computer interface (SR-HCI) as a subsystem of the audio-visual breast self-examination guidance system. This aims to better control the system during computer-guided breast self-examination (BSE) performance and allows for user indications of possible tumor locations by dictating it to the system through the speech recognition feature...

chapter

Classification of vocal and non-vocal regions from audio songs using spectral features and pitch variations

Y. V. Srinivasa Murthy, Shashidhar G. Koolagudi

2015 IEEE 28th Canadian Conference on Electrical and Computer Engineering (CCECE) > 1271 - 1276

2015 IEEE 28th Canadian Conference on Electrical and Computer Engineering (CCECE)

In this work, an effort has been made to identify vocal and non-vocal regions from a given song using signal processing techniques and a machine learning algorithm. Initially, spectral features like mel-frequency cepstral coefficients (MFCCs) are used to develop the baseline system. Statistical values of pitch, jitter and shimmer are considered to improve performance of the system. Artificial neural...

chapter

Effects of feature type, learning algorithm and speaking style for depression detection from speech

Vikramjit Mitra, Elizabeth Shriberg

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4774 - 4778

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Computational methods for speech-based detection of depression are still relatively new, and have focused on either a standard set of features or on specific additional approaches. We systematically study the effects of feature type, machine learning approach, and speaking style (read versus spontaneous) on depression prediction in the AVEC-2014 evaluation corpus, using features related to speech...

chapter

Gender identification and performance analysis of speech signals

G. S. Archana, M. Malleswari

2015 Global Conference on Communication Technologies (GCCT) > 483 - 489

2015 Global Conference on Communication Technologies (GCCT)

Speech is an important means of communication. Gender is the most significant characteristic of speech. Pitch is a commonly used feature for gender classification as it differs between male and female voices. But this method is not applicable in cases where the pitch of male and female voices is almost the same. In this paper the above limitation is addressed by extracting other features like Mel Frequency...

chapter

Classification of respiratory pathology in pulmonary acoustic signals using parametric features and artificial neural network

Rajkumar Palaniappan, Kenneth Sundaraj, Sebastian Sundaraj, N. Huliraj, more

2014 IEEE International Conference on Computational Intelligence and Computing Research > 1 - 6

2014 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC)

Pulmonary acoustic signal analysis provides essential information on the present state of the lungs. In this paper, we intend to distinguish between normal, airway obstruction pathology and interstitial lung disease using pulmonary acoustic signal recordings. The proposed method extracts Mel frequency cepstral coefficients (MFCC) and AR coefficients as features from pulmonary acoustic signals. The...

chapter

Classification of emphatic consonants and their counterparts in Modern Standard Arabic using neural networks

Yasser M. Seddiq, Yousef A. Alotaibi, Sid-Ahmed Selouani

2014 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT) > 73 - 77

2014 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT)

This paper presents the work of acoustic analysis related to Modern Standard Arabic (MSA). The problem of classifying the consonant counterparts in MSA is tackled here. The study considers four phonemes: /dˤ, ðˤ/ and their non-emphatic counterparts /d, ð/ respectively. An accurate automatic classification for those phonemes is to be achieved. Artificial neural networks (ANNs) are used for that purpose...

chapter

A Class of Neuro-computational Models to Verify Mood Variation in Dialectal Assamese Speech

Swapna Agarwalla, Kandarpa Kumar Sarma

2014 2nd International Symposium on Computational and Business Intelligence > 85 - 88

2014 2nd International Symposium on Computational and Business Intelligence (ISCBI)

Mood content in spoken word recognition is an important element in the formulation of a decision support system (DSS). It often becomes an integral component of human computer interaction (HCI) systems based on speech recognition with language orientation. In this paper, we propose a mood verification system for speakers of the Assamese language with dialectal components. Five features namely Mel Frequency...

chapter

Modelling and characterization of an artificial neural network for infant cry recognition using mel-frequency cepstral coefficients

Argel A. Bandala, Allimzon M. Lim, Mark Anthony D. Cai, Allan Jeffrey C. Bacar, more

TENCON 2014 - 2014 IEEE Region 10 Conference > 1 - 6

TENCON 2014 - 2014 IEEE Region 10 Conference

This paper is about the creation of an artificial neural network (ANN) in MATLAB to analyze the features extracted from calculating the mel-frequency cepstral coefficients (MFCC) of the raw audio data. The paper explains basic concepts about the ANN, as well as the MFCC and other relevant theories. Regarding the design of the ANN, it uses multiple infant crying sounds, as well as non-crying sounds,...

Data set:
ieee
Keywords:
ARTIFICIAL NEURAL NETWORKS
FEATURE EXTRACTION
MEL FREQUENCY CEPSTRAL COEFFICIENT

Publication type

book (57)
article (1)

Keywords

SPEECH (31)
SPEECH RECOGNITION (29)
TRAINING (24)
MFCC (16)
ACCURACY (14)
HIDDEN MARKOV MODELS (13)
SPEECH PROCESSING (13)
CLASSIFICATION ALGORITHMS (11)
CEPSTRAL ANALYSIS (9)
MULTILAYER PERCEPTRONS (9)
ARTIFICIAL NEURAL NETWORK (8)
NEURAL NETS (8)
NEURAL NETWORK (8)
NEURONS (7)
PEDIATRICS (7)
SIGNAL PROCESSING (7)
SIGNAL PROCESSING ALGORITHMS (7)
COMPUTERS (6)
PATTERN CLASSIFICATION (6)
PATTERN RECOGNITION (6)
SPEAKER RECOGNITION (6)
SUPPORT VECTOR MACHINES (6)
DATA MINING (5)
DATABASES (5)
NOISE (5)
ROBUSTNESS (5)
ACOUSTICS (4)
ALGORITHM DESIGN AND ANALYSIS (4)
AUTOMATIC SPEECH RECOGNITION (4)
CEPSTRUM (4)
EDUCATIONAL INSTITUTIONS (4)
FILTER BANK (4)
MEL FREQUENCY CEPSTRAL COEFFICIENTS (4)
SUPPORT VECTOR MACHINE CLASSIFICATION (4)
TESTING (4)
TRAINING DATA (4)
VECTOR QUANTIZATION (4)
ACOUSTIC SIGNAL PROCESSING (3)
ANN (3)
AUDIO SIGNAL PROCESSING (3)
BACKPROPAGATION (3)
CLUSTERING ALGORITHMS (3)
COMPLEXITY THEORY (3)
COMPUTATIONAL MODELING (3)
DATA MODELS (3)
DISCRETE WAVELET TRANSFORMS (3)
ELECTRONIC MAIL (3)
FEATURE SELECTION (3)
HEURISTIC ALGORITHMS (3)
HMM (3)
MEL FREQUENCY CEPSTRAL COEFFICIENTS (MFCC) (3)
MEL-FREQUENCY CEPSTRAL COEFFICIENTS (3)
MULTILAYER PERCEPTRON (MLP) (3)
MUSIC (3)
NEURAL NETWORKS (3)
PATHOLOGY (3)
PRINCIPAL COMPONENT ANALYSIS (3)
SIGNAL CLASSIFICATION (3)
WRITING (3)
ACOUSTIC FEATURES (2)
ASPHYXIA (2)
AUDITORY SYSTEM (2)
BOOKS (2)
BPNN (2)
CLASSIFICATION (2)
COMPANIES (2)
CONFERENCES (2)
CORRELATION (2)
COVARIANCE MATRIX (2)
DIGITAL SIGNAL PROCESSING (2)
DISCRETE COSINE TRANSFORMS (2)
DYNAMIC TIME WARPING (2)
EIGENVALUES AND EIGENFUNCTIONS (2)
ENCODING (2)
ENGLISH LANGUAGE (2)
ENTROPY (2)
EQUATIONS (2)
FACE RECOGNITION (2)
FEATURE EXTRACTION METHOD (2)
FEED FORWARD NEURAL NETWORK (2)
FEEDFORWARD NEURAL NETS (2)
FILTER BANKS (2)
GAIN (2)
GALLIUM NITRIDE (2)
GENETIC ALGORITHM (2)
GENETIC ALGORITHMS (2)
GMM (2)
HUMANS (2)
HYPOTHYROIDISM (2)
INFORMATION SCIENCE (2)
INTERNET (2)
LEARNING (ARTIFICIAL INTELLIGENCE) (2)
LINEAR DISCRIMINANT ANALYSIS (LDA) (2)
LINEAR PREDICTIVE CODING (2)
LPC (2)
LUNGS (2)
MATHEMATICAL MODEL (2)

INFONA - science communication portal
