This paper deals with the recognition of Bangla speech. The database used consists of two sets of data: a training set containing 3824 utterances of Bangla digit sequences from 25 male and 25 female speakers, and a test set containing 1985 utterances from 26 male and 26 female speakers. The test set is subdivided into four groups: clean1, clean2, clean3 and clean4...
In this paper, we propose a scheme for recognizing isolated spoken Arabic digits based on Discrete Wavelet Transform (DWT) features. The Discrete Wavelet Transform is a transformation that can be used to analyze the temporal and spectral properties of non-stationary signals such as audio, based on the time-frequency multi-resolution property of the wavelet transform. In this paper, the extracted wavelet...
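The abstract above does not specify the wavelet family or decomposition depth used; as a minimal sketch of the kind of transform involved, the following implements a single level of the Haar DWT, the simplest member of the family. The function name and padding behavior are illustrative assumptions, not the paper's pipeline.

```python
import numpy as np

def haar_dwt(signal):
    """One level of the Haar Discrete Wavelet Transform.

    Returns (approximation, detail) coefficients, capturing the
    low-frequency and high-frequency content of the signal respectively.
    """
    x = np.asarray(signal, dtype=float)
    if len(x) % 2:                      # zero-pad to an even length
        x = np.append(x, 0.0)
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2)  # low-pass (scaling) coefficients
    detail = (even - odd) / np.sqrt(2)  # high-pass (wavelet) coefficients
    return approx, detail

# A constant signal has all its energy in the approximation band.
a, d = haar_dwt([1.0, 1.0, 1.0, 1.0])
```

Applying the transform recursively to the approximation band yields the multi-resolution decomposition the abstract refers to.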
This paper presents an algorithm for automatically segmenting a subset of emphatic and non-emphatic sounds from continuously spoken Arabic speech. The main contribution of this paper is the generation of rules for automatic segmentation of these sounds, which can be extended to the rest of the Arabic sounds. In addition, the findings can be used for other speech analysis problems such as data training for...
This paper proposes an emotion recognition system that recognizes a person's emotional state from the speech signal. The aim of the proposed solution is to improve the interaction between humans and computers. The emotion recognition system must be capable of recognizing at least six basic emotions (happiness, anger, surprise, disgust, fear, sadness) as well as a neutral state. The proposed system...
This paper investigates the effect of topic dependent language models (TDLM) on phonetic spoken term detection (STD) using dynamic match lattice spotting (DMLS). Phonetic STD consists of two steps: indexing and search. The accuracy of indexing audio segments into phone sequences using phone recognition methods directly affects the accuracy of the final STD system. If the topic of a document is known,...
For spoken language processing applications such as speaker recognition/verification, silence segments not only contribute no speaker-specific information, but also dilute the information content already available in the speech segments of the audio data. It has been experimentally shown that removing silence segments with a voice activity detector (VAD) from the utterance...
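The abstract above does not say which VAD the study uses; as a hedged illustration of the general idea, the following sketches the simplest kind of energy-based VAD, dropping frames whose short-time energy falls below a fraction of the peak frame energy. The function name and threshold are assumptions for illustration only.

```python
import numpy as np

def remove_silence(samples, frame_len=256, threshold_ratio=0.1):
    """Drop frames whose short-time energy falls below a fraction of the
    utterance's peak frame energy (a simple energy-based VAD)."""
    x = np.asarray(samples, dtype=float)
    n_frames = len(x) // frame_len
    frames = x[:n_frames * frame_len].reshape(n_frames, frame_len)
    energy = np.sum(frames ** 2, axis=1)        # per-frame energy
    keep = energy > threshold_ratio * energy.max()
    return frames[keep].ravel()                 # concatenate voiced frames

# Silence (zeros) surrounding a burst of "speech" is removed.
signal = np.concatenate([np.zeros(512), np.ones(512), np.zeros(512)])
voiced = remove_silence(signal)
```

Practical VADs add smoothing and hangover logic so short pauses inside words are not clipped, but the energy-threshold core is the same.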
Sentence boundary detection (SBD), also known as sentence breaking, determines where a sentence begins and ends. This paper describes sentence boundary detection using acoustic and prosodic features for spontaneous Malay spoken audio. In our preliminary experiment on detecting sentence boundaries, we added volume change rate to seven prosodic features and rate-of-speech. Experiments...
The main goal of developing formal meeting-logging systems is to automate the whole process of transcribing participants' speech. In this paper we outline modern methods of audio and video signal processing and personification data analysis for multimodal speaker diarization. The proposed PARAD-R software for Russian speech analysis is implemented for audio speaker diarization and...
The use of biometric information for both person identification and security applications is widely known. Each person can be identified by the unique characteristics of one or more of their biometrics. One biometric characteristic by which a person can be identified is the voice. In this research, we are interested in studying the effect of proper features that are extracted from discrete...
This paper proposes tone model enhancement for low-complexity tone recognition. The tone model reduces the number of input frames by estimating fundamental frequency (F0) from only the estimated vowel signals, using the vowel magnitude difference function (vowel-MDF, VMDF). Accordingly, it reduces the negative influence on F0 of neighboring syllables in continuous speech. We enhance tone recognition accuracy...
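The VMDF itself is specific to the paper above, but it belongs to the family of magnitude-difference pitch estimators. As a hedged sketch of that family, the following implements the classic average magnitude difference function (AMDF): the lag that minimizes the mean absolute difference between a frame and its shifted copy corresponds to the fundamental period. All names and parameters here are illustrative assumptions.

```python
import numpy as np

def amdf_f0(frame, sample_rate, f0_min=80.0, f0_max=400.0):
    """Estimate F0 via the average magnitude difference function (AMDF):
    search lags over a plausible pitch range and pick the lag where the
    frame best matches its own shifted copy."""
    x = np.asarray(frame, dtype=float)
    lag_min = int(sample_rate / f0_max)   # shortest candidate period
    lag_max = int(sample_rate / f0_min)   # longest candidate period
    amdf = [np.mean(np.abs(x[:-lag] - x[lag:]))
            for lag in range(lag_min, lag_max + 1)]
    best_lag = lag_min + int(np.argmin(amdf))
    return sample_rate / best_lag

# A 200 Hz sine sampled at 8 kHz; the search range is restricted
# (f0_min=120) so the octave-down lag cannot be selected by mistake.
sr = 8000
t = np.arange(int(0.05 * sr)) / sr
f0 = amdf_f0(np.sin(2 * np.pi * 200 * t), sr, f0_min=120.0)
```

Restricting the lag range, as in the demo, is a standard guard against octave errors; the paper's VMDF additionally restricts the frames themselves to vowel regions.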
In this paper, a study of methods for multi-parameter objective evaluation of pronunciation is introduced. Sentence accuracy, emotional expression, volume matching degree, tone, speaking rate and rhythm are selected as the parameters for evaluating an English sentence. As a result of the evaluation, an objective rating of the input voice, as well as feedback, is presented to...
This work describes the development of a scheme for remotely retrieving spoken documents stored on a voice server. The spoken documents are recorded, indexed based on the frequency of occurrence of isolated keywords, and stored on the voice server. An isolated word recognizer (IWR) is developed for recognizing the identified keywords spoken in isolation. The IWR employs foreground...
The use of digital technology is growing at a very fast pace, which has led to the emergence of systems based on cognitive infocommunications. The expansion of this sector imposes the use of combined methods in order to ensure robustness in cognitive systems.
Emotion recognition from speech has evolved into one of the most significant research areas in the field of affective computing. In this paper, two emotional speech datasets have been analyzed based on gender distinction (male and female speech). This paper introduces a new approach to speech emotion recognition based on the AdaBoost classification algorithm. An artificial neural network has been...
Speech recognition is one of the promising technologies of the future. Voice user interfaces play an important role in many real-world applications. This paper presents speaker-independent isolated digit recognition for the Malayalam language and describes some application areas of digit recognition. Mel-Frequency Cepstral Coefficients (MFCC) are used as features and a Hidden Markov Model (HMM) is used as the...
Detecting emotional traits in call centre interactions can be beneficial to the quality management of the services provided, since this reveals the positioning of both speakers, i.e. satisfaction or frustration and anger on the customers' side, and stress detection, disappointment mitigation or failure to provide the requested service on the operators' side. This paper describes a machine learning...
In this paper, we present a model for Turkish speech recognition. The model is syllable-based, where the recognition is performed through syllables as speech recognition units. The main goal of the model is to recognize as much as possible of a given continuous speech by identifying only a small set of syllables in the language. For that purpose, only the syllable types with a higher frequency are...
Lip reading technologies play a great role not only in image pattern recognition, e.g. computer vision, but also in audio-visual pattern recognition, e.g. bimodal speech recognition. However, one problem is that the recognition accuracy is still significantly lower than that of speech recognition. Another problem is the performance degradation that occurs in real environments. To improve the...
The hidden Markov model is regarded as the most common and effective method used in speech recognition for all languages, including Vietnamese. However, this method is quite cumbersome and difficult to implement in many embedded systems that have limited resources. The Dynamic Time Warping (DTW) method, in contrast, has been studied extensively by many scientists and has proved to be simple and efficient for a relatively...
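The abstract above does not show the DTW recurrence; as a minimal sketch of the standard algorithm it refers to, the following computes the DTW distance between two 1-D sequences with the classic dynamic-programming recurrence. The absolute-difference local cost is an illustrative choice (real recognizers compare feature vectors such as MFCCs).

```python
import numpy as np

def dtw_distance(a, b):
    """Classic Dynamic Time Warping distance between two 1-D sequences,
    using |a[i] - b[j]| as the local cost."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)   # accumulated cost matrix
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # best of insertion, deletion, and match moves
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# A time-stretched copy of the same pattern aligns with zero cost.
d = dtw_distance([0, 1, 2, 1, 0], [0, 0, 1, 1, 2, 1, 0])
```

The O(nm) table and absence of training are what make DTW attractive for the resource-limited embedded systems the abstract mentions.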
In text-independent speaker verification, we compare two sets of sentences with different text content for their tonal similarity to determine whether they were produced by the same speaker. Since the sentences are different, we may not have matching words to compare. However, the sentences are constructed from the same set of phonemes of the language used, including vowels and consonants. Generally speaking,...