Search results

Items from 121 to 140 out of 654 results

chapter

Employing speech and location information for automatic assessment of child language environments

Maryam Najafian, Dwight Irvin, Ying Luo, Beth S. Rous, et al.

2016 First International Workshop on Sensing, Processing and Learning for Intelligent Machines (SPLINE) > 1 - 5

Assessment of the language environment of children in early childhood is a challenging task for both human and machine, and understanding the classroom environment of early learners is an essential step towards facilitating language acquisition and development. This paper explores an approach for intelligent language environment monitoring based on the duration of child-to-child and adult-to-child...

chapter

Phonetic posteriorgrams for many-to-one voice conversion without parallel data training

Lifa Sun, Kun Li, Hao Wang, Shiyin Kang, et al.

2016 IEEE International Conference on Multimedia and Expo (ICME) > 1 - 6

This paper proposes a novel approach to voice conversion with non-parallel training data. The idea is to bridge between speakers by means of Phonetic PosteriorGrams (PPGs) obtained from a speaker-independent automatic speech recognition (SI-ASR) system. It is assumed that these PPGs can represent articulation of speech sounds in a speaker-normalized space and correspond to spoken content speaker-independently...

chapter

Holy Qur'an speech recognition system distinguishing the type of recitation

Bilal Yousfi, Akram M. Zeki

2016 7th International Conference on Computer Science and Information Technology (CSIT) > 1 - 6

The act of reading the Qur'an and pronouncing its sounds depends on the type of recitation, namely the recitation of Warsh or the recitation of Hafss. It is very important to recognise the type of recitation, especially given the diversity and spread of Qira'at in the world. This research presents a speech recognition system that distinguishes between the different types of the Qur'an...

chapter

Prosodic and voice quality features for speaker verification over coded channel

Jozef Polacky, Michal Chmulik, Roman Jarina

2016 39th International Conference on Telecommunications and Signal Processing (TSP) > 327 - 330

Jitter and shimmer, as indicators of voice quality, are often used to detect speech disorders and a variety of narrative styles. In this article we examine the suitability of jitter and shimmer voice quality measurements for the speaker verification task. We combine these voice quality features and further prosodic features with short-term spectral features (namely MFCCs). For the purposes of...

chapter

Emotion classification using residual sinusoidal peak amplitude

Suman Deb, S. Dandapat

2016 International Conference on Signal Processing and Communications (SPCOM) > 1 - 5

In this work, a new feature, residual sinusoidal peak amplitude (RSPA), is proposed for emotion classification. The RSPA feature is evaluated from the LP residual of the speech signal using a sinusoidal model. The residual signal is a major source of the excitation, and it is expected that emotional information can be well manifested in the residual signal. The effectiveness of the proposed feature is explored...

chapter

MFCC based noise reduction in ASR using Kalman filtering

Anuradha P Nair, Shoba Krishnan, Zia Saquib

2016 Conference on Advances in Signal Processing (CASP) > 474 - 478

Speech enhancement using the Kalman filter is an extensively researched area. The vast majority of work done in this area uses linear predictive coding (LPC) for modeling the speech signal. A few important studies have revealed the superiority of Mel Frequency Cepstral Coefficients (MFCC) over LPC for speech recognition. With this paper, the shortcomings of speech enhancement using LPC with Kalman filters...

chapter

Significance of constraining text in limited data text-independent speaker verification

Rohan Kumar Das, Sarfaraz Jelil, S. R. Mahadeva Prasanna

2016 International Conference on Signal Processing and Communications (SPCOM) > 1 - 5

This work highlights the importance of a phonetic match between training and test sessions for a text-independent framework under limited test data conditions. The robustness of text-independent speaker verification (SV) tends to degrade as the amount of speech involved is reduced. From the point of view of a deployable, application-oriented system, the amount of speech involved is expected to be less to...

chapter

2-D psychoacoustic modeling for automatic speech recognition in noisy environment

Sampreeta Desai, Prasad D. Khandekar, Ketan J. Raut

2016 Conference on Advances in Signal Processing (CASP) > 129 - 132

A powerful automatic speech recognition (ASR) system is a matter of commercial importance, as many leading companies are sprinting toward industry- and consumer-level production. One of the major factors that hampers speech quality is environmental noise. Speech gets obscured by loud background sound. This adversely affects the performance of automatic speech recognition systems. We also know that human...

chapter

A novel approach for Marathi numeral recognition using Bark scale and discrete sine transform method

Gauri Ghule, Prachi Mukherji

2016 Conference on Advances in Signal Processing (CASP) > 191 - 195

Marathi is spoken by the native people of Maharashtra. Spoken word recognition in Marathi is a widely studied area of research. This paper describes a method for recognition of Marathi numerals from ‘Shunya’ (zero) to ‘Nau’ (nine) using the Bark scale and the Discrete Sine Transform. Features extracted using the Bark scale are transformed and reduced using statistical properties. A unique method for feature vector...

chapter

Modification in sequential dynamic time warping for fast computation of query-by-example spoken term detection task

Maulik C. Madhavi, Hemant A. Patil

2016 International Conference on Signal Processing and Communications (SPCOM) > 1 - 5

Query-by-Example Spoken Term Detection (QbE-STD) under low-resource settings is a retrieval task carried out using an audio example of the query. The search phase involves computationally intensive Dynamic Time Warping (DTW)-based matching techniques. Search space reduction is therefore important for reducing the computational complexity....

chapter

Low complexity language recognition exploiting ensemble of random subspace

Om Prakash Singh, Rohit Sinha

2016 International Conference on Signal Processing and Communications (SPCOM) > 1 - 5

The current approaches for spoken language recognition (LR) are predominantly based on the GMM mean supervector as the representation of the utterances. It is assumed that the language information lies in a linear manifold of low dimensional spaces. Exploiting that, low dimensional projections of the GMM mean supervectors, known as i-vectors, are derived using a total variability matrix. The i-vector...

chapter

Music genre classification using data mining algorithm

Mangesh M. Panchwagh, Vijay D. Katkar

2016 Conference on Advances in Signal Processing (CASP) > 49 - 53

Multimedia data generated and stored online is growing at a huge rate. In order to access the desired multimedia data in real time, it needs to be analyzed and stored in a categorical manner. This paper presents a data mining based approach for categorization of musical data. Experiments are performed using various data mining classifiers and preprocessing methods. This paper also compares the performance...

chapter

Exploring the role of pitch-adaptive cepstral features in context of children's mismatched ASR

Rohit Sinha, S Shahnawazuddin, Patri Satya Karthik

2016 International Conference on Signal Processing and Communications (SPCOM) > 1 - 5

The presented work explores the role of pitch-adaptive cepstral features in the context of automatic speech recognition (ASR) of children's speech using acoustic models trained on adults' speech. On account of the large acoustic mismatch between training and test data, highly degraded recognition rates are noted for such cases. Earlier studies have shown that the said acoustic mismatch is aided by the insufficient...

chapter

Analysis of hierarchical bottleneck framework for improved phoneme recognition

Mohammadi Zaki, Hardik B. Sailor, Hemant A. Patil

2016 International Conference on Signal Processing and Communications (SPCOM) > 1 - 5

In this paper, an attempt is made to examine and evaluate the effect of bottleneck and the hierarchical bottleneck (HBN) framework in MLP-based Automatic Speech Recognition (ASR) systems. In particular, the bottleneck and hierarchical bottleneck framework are analyzed using Volterra series. Experiments on several architectures with incorporation of systematic hierarchical and bottleneck properties...

chapter

Countermeasure to handle replay attacks in practical speaker verification systems

Anupama Paul, Rohan Kumar Das, Rohit Sinha, S. R. Mahadeva Prasanna

2016 International Conference on Signal Processing and Communications (SPCOM) > 1 - 5

In this work, a novel countermeasure is proposed for protecting the speaker verification (SV) system against replay-based spoofing attacks. Replay attacks are attacks made by playing back recorded speech of a particular speaker to the system while claiming to be the authentic speaker. On analyzing live and recorded speech examples, it was noted that the low frequency contents get suppressed...

chapter

Emotional voice conversion using deep neural networks with MCC and F0 features

Zhaojie Luo, Tetsuya Takiguchi, Yasuo Ariki

2016 IEEE/ACIS 15th International Conference on Computer and Information Science (ICIS) > 1 - 5

An artificial neural network is one of the most important models for training features in a voice conversion task. Typically, Neural Networks (NNs) are not effective in processing low-dimensional F0 features, so the performance of methods that use neural networks for training Mel Cepstral Coefficients (MCC) is not outstanding. However, F0 can robustly represent various prosody...

chapter

Influences of languages in speech emotion recognition: A comparative study using Malay, English and Mandarin languages

Rajesvary Rajoo, Ching Chee Aun

2016 IEEE Symposium on Computer Applications & Industrial Electronics (ISCAIE) > 35 - 39

Emotion recognition plays a significant role in affective computing and adds value to machine intelligence. While the emotional state of a person can be manifested in different ways such as facial expressions, gestures, movements and postures, recognition of emotion from speech has gathered much interest over others. However, after years of research, recognizing the emotional state of individuals...

chapter

Efficient audio segmentation in soccer videos

M A Raghuram, Nikhil R. Chavan, Shashidhar G. Koolagudi, Pravin B. Ramteke

2016 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE) > 1 - 4

Identifying different audio segments in videos is the first step for many important tasks such as event detection and speech transcription. Approaches using Mel-Frequency Cepstral coefficients (MFCCs) with Gaussian mixture models (GMMs) and hidden Markov models (HMMs) perform reasonably well in stationary conditions but do not scale to a broad range of environmental conditions. This paper focuses...

chapter

Speech Recognition in a Multi-speaker Environment by Using Hidden Markov Model and Mel-frequency Approach

Junzo Watada, Hanayuki

2016 Third International Conference on Computing Measurement Control and Sensor Network (CMCSN) > 80 - 83

Sound is a useful and versatile form of communication, where each sound has its own characteristics and frequency levels. Sound serves two basic functions for people around the world: signaling and communication. Several problems arise in identifying sounds, such as pitch, velocity, and the accuracy of processing voice data. The motivation of this research is to recognize and analyze human...

chapter

Analysis of Asthma by using Mel frequency cepstral coefficient

V. D. Dighore, V. R. Thool

2016 IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT) > 976 - 980

Asthma is a lung disease that affects airflow to and from the lungs. A whistling sound occurs when a person suffering from asthma breathes in and out. Major symptoms of asthma are chest stiffness, shortness of breath, and coughing during the night and morning. In this paper, asthma is analyzed with the help of the Mel Frequency Cepstral Coefficient (MFCC). In this system, MFCC for Normal Voice and for...

Data set: ieee
Keywords: FEATURE EXTRACTION, MEL FREQUENCY CEPSTRAL COEFFICIENT, SPEECH
Publication type: book
