Search results

Items from 41 to 60 out of 1,468 results

chapter

Segmentation the speech of hard of hearing children

Laszlo Czap, Judit Maria Pinter, Attila K. Varga

2017 18th International Carpathian Control Conference (ICCC) > 446 - 450

2017 18th International Carpathian Control Conference (ICCC)

One service provided by our application ‘Speech Assistant System’ assisting the teaching of the hearing impaired to speak is the automatic assessment of words and sentences in the course of practice and feedback to the person. Individual speech sounds can only be correctly evaluated if they are compared with the appropriate reference speech sounds. This requires segmenting the speech to be examined...

chapter

A Novel Concept of the Rehabilitation Training Coach Robot for Patients with Disability

Seung-Ho Han, Han-Gyu Kim, Ho-Jin Choi

2017 18th IEEE International Conference on Mobile Data Management (MDM) > 376 - 381

2017 18th IEEE International Conference on Mobile Data Management (MDM)

This paper proposes the rehabilitation treatment coach robot which will help at-home patients do their rehabilitation exercises at home without any professional trainers. The coach robot is designed to be cheap enough for patients to afford it. The robot suggests the rehabilitation program and corrects the posture of the patients during the exercise. The deep neural network is used for posture correction...

chapter

A study of support vector machines for emotional speech recognition

Nattapong Kurpukdee, Sawit Kasuriya, Vataya Chunwijitra, Chai Wutiwiwatchai, more

2017 8th International Conference of Information and Communication Technology for Embedded Systems (IC-ICTES) > 1 - 6

2017 8th International Conference of Information and Communication Technology for Embedded Systems (IC-ICTES)

In this paper, efficiency comparison of Support Vector Machines (SVM) and Binary Support Vector Machines (BSVM) techniques in utterance-based emotion recognition is studied. Acoustic features including energy, Mel-frequency cepstral coefficients (MFCC), Perceptual linear predictive (PLP), Filter bank (FBANK), pitch, their first and second derivatives are used as frame-based features. Four basic emotions...

chapter

Towards intoxicated speech recognition

Zixing Zhang, Felix Weninger, Martin Wollmer, Jing Han, more

2017 International Joint Conference on Neural Networks (IJCNN) > 1555 - 1559

2017 International Joint Conference on Neural Networks (IJCNN)

In a real-life scenario, the acoustic characteristics of speech often suffer from the variations induced by diverse environmental noises and different speakers. To overcome the speaker-related speech variation problem for Automatic Speech Recognition (ASR), many speaker adaptation techniques have been proposed and studied. Almost all of these studies, however, only considered the speakers' long-term...

chapter

On the use of deep recurrent neural networks for detecting audio spoofing attacks

Simone Scardapane, Lucas Stoffl, Florian Rohrbein, Aurelio Uncini

2017 International Joint Conference on Neural Networks (IJCNN) > 3483 - 3490

2017 International Joint Conference on Neural Networks (IJCNN)

Biometric security systems based on predefined speech sentences are extremely common nowadays, particularly in low-cost applications where the simplicity of the hardware involved is a great advantage. Audio spoofing verification is the problem of detecting whether a speech segment acquired from such a system is genuine, or whether it was synthesized or modified by a computer in order to make it sound...

chapter

Symbolic manipulation based on deep neural networks and its application to axiom discovery

Cheng-Hao Cai, Dengfeng Ke, Yanyan Xu, Kaile Su

2017 International Joint Conference on Neural Networks (IJCNN) > 2136 - 2143

2017 International Joint Conference on Neural Networks (IJCNN)

Symbolic reasoning is difficult for neural networks. Especially, reasoning with variables can be a challenging task for them. In this paper, a symbolic reasoning method based on deep neural networks is proposed, and this method is applied to axiom discovery. This method makes use of the concept of “symbolic manipulation”. Specifically, it relies on the learning ability of the deep neural networks...

chapter

Improved speaker recognition system for stressed speech using deep neural networks

Sri Harsha Dumpala, Sunil Kumar Kopparapu

2017 International Joint Conference on Neural Networks (IJCNN) > 1257 - 1264

2017 International Joint Conference on Neural Networks (IJCNN)

Good speaker recognition systems should identify the speaker irrespective of what is spoken, including non-speech sounds that are often produced during natural conversations. In this work, the inclusion of breath sounds in the training phase of the speaker recognition is analyzed using the popular Gaussian mixture model-universal background model (GMM-UBM) and deep neural network (DNN) based systems...

chapter

Speech-based emotion recognition and next reaction prediction

Fatemeh Noroozi, Neda Akrami, Gholamreza Anbarjafari

2017 25th Signal Processing and Communications Applications Conference (SIU) > 1 - 4

2017 25th Signal Processing and Communications Applications Conference (SIU)

Communication through voice is one of the main components of affective computing in human-computer interaction. In this type of interaction, properly comprehending the meanings of the words or the linguistic category and recognizing the emotion included in the speech is essential for enhancing the performance. In order to model the emotional state, the speech waves are utilized, which bear signals...

chapter

Identification of asphyxia in newborns using gpu for deep learning

Minal Moharir, M U Sachin, Rishab Nagaraj, M Samiksha, more

2017 2nd International Conference for Convergence in Technology (I2CT) > 236 - 239

2017 2nd International Conference for Convergence in Technology (I2CT)

With the rapid advancement in technology, we still observe a significant amount of deaths of children under the age of five years. Majority of these deaths worldwide can be attributed to various medical conditions out of which three are very significant: birth asphyxia, preterm and infections. Birth asphyxia (perinatal asphyxia) is a medical condition which is characterised by abnormal breathing patterns...

chapter

Text dependent voice recognition system using MFCC and VQ for security applications

Ashwin Nair Anil Kumar, Senthil Arumugam Muthukumaraswamy

2017 International conference of Electronics, Communication and Aerospace Technology (ICECA) > 2 > 130 - 136

2017 International conference of Electronics, Communication and Aerospace Technology (ICECA)

This paper presents the implementation of a practical voice recognition system using MATLAB (R2014b) to secure a given user's system so that only the user may access it. Voice recognition systems have two phases, training and testing. During the training phase, the characteristic features of the speaker are extracted from the speech signal and stored in a database. In the testing phase, the stored...

chapter

Lip-reading via a DNN-HMM hybrid system using combination of the image-based and model-based features

Mohammad Hasan Rahmani, Farshad Almasganj

2017 3rd International Conference on Pattern Recognition and Image Analysis (IPRIA) > 195 - 199

2017 3rd International Conference on Pattern Recognition and Image Analysis (IPRIA)

Introducing features that better represent the visual information of speakers during the speech production is still an open issue that highly affects the quality of the lip-reading and Audio Visual Speech Recognition (AVSR) tasks. In this paper, three different types of visual features from both the image-based and model-based ones are investigated inside a professional lip reading task. The simple...

chapter

The phoneme set influence for lithuanian speech commands recognition accuracy

Mindaugas Greibus, Zivile Ringeliene, Laimutis Telksnys

2017 Open Conference of Electrical, Electronic and Information Sciences (eStream) > 1 - 4

2017 Open Conference of Electrical, Electronic and Information Sciences (eStream)

The phoneme set influence for Lithuanian speech commands recognition accuracy is investigated. Four phoneme sets are discussed. LIEPA speech corpus for training of Acoustic Model is used. The phonetic representation of corpus transcriptions is generated by grapheme-to-phoneme transformation rules. Rule based transformations for Lithuanian language is proposed. Recognition engine with CMU Pocketsphinx...

chapter

Research on multi-base depth neural network speech recognition

Cai Jun, Li Fei, Zhang Yi, Liu Yu

2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC) > 1540 - 1544

2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC)

In speech recognition system, an improved multi-base neural network speech recognition model is proposed to solve the problem of long learning time and slow convergence rate of deep neural network. However, the improved model introduces a large number of parameters in the training process to make the model over-fitted in the test set, resulting in the deterioration of generalization ability and the...

chapter

Fast speech keyword recognition based on improved filler model

Yang Wang, Jie Yang, Le Zhang

2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC) > 530 - 534

2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC)

Most traditional template matching based keyword recognition methods don't need training data, just rely on frame matching. However, the recognition speed is relatively slow and it can't be used in practice. The LVCSR-based method needs to convert the speech signal into text signal before recognition, which has an important impact on the final recognition performance. In this paper, we propose a method...

chapter

Investigative study of various activation functions for speech recognition

Hari Krishna Vydana, Anil Kumar Vuppala

2017 Twenty-third National Conference on Communications (NCC) > 1 - 5

2017 Twenty-third National Conference on Communications (NCC)

Significant developments in deep learning methods have been achieved with the capability to train more deeper networks. The performance of speech recognition system has been greatly improved by the use of deep learning techniques. Most of the developments in deep learning are associated with the development of new activation functions and the corresponding initializations. The development of Rectified...

chapter

Addressing data sparsity in DNN acoustic modeling

Seeram Tejaswi, S Umesh

2017 Twenty-third National Conference on Communications (NCC) > 1 - 5

2017 Twenty-third National Conference on Communications (NCC)

This paper presents our work on developing acoustic models using deep neural networks (DNN) for low resource languages. This is considered one of the challenging problems in automatic speech recognition (ASR) as DNNs need large amount of data for building efficient models. The techniques explored in this approach use a common idea of transferring knowledge from models of high resource language to...

chapter

Speech signals identification base on improved DBN

Cai Jun, Yao Qin, Zhang Yi

2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC) > 1144 - 1148

2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC)

For the problem low speech recognition rate, an improved method of combining Deep Belief Network (DBN) with support vector machine (SVM) for analyzing Small sample speech signals is proposed. The speech signal data collected as the training sample is used for training the DBN to get the optimal parameter values. The trained DBN is utilized for feature extraction, and these speech sample data signals...

chapter

Multidimensional speaker information recognition based on proposed baseline system

Shan Li, Longting Xu, Zhen Yang

2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC) > 1776 - 1780

2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC)

Traditional speech-related identity recognition commonly pays attention to individual aspect of speech signals but in reality, the speech signals are made up of semantics, speaker dependent features, etc. This paper therefore presents a new study that recognizes simultaneously multidimensional speaker information. In order to extract sufficient relational features, both high-level and low-level features...

chapter

DNN acoustic models for dysarthric speech

Seeram Tejaswi, S Umesh

2017 Twenty-third National Conference on Communications (NCC) > 1 - 4

2017 Twenty-third National Conference on Communications (NCC)

In this paper, we investigate various training methods for building deep neural network (DNN) based acoustic models for dysarthric speech data. Methods like multitask learning, knowledge distillation and model adaptation, which overcome data sparsity and model over-fitting problems are employed to study the merits of each method. In Knowledge distillation framework, some privilege information in addition...

chapter

Acoustic and language modeling for children's read speech assessment

Hitesh Tulsiani, Prakhar Swarup, Preeti Rao

2017 Twenty-third National Conference on Communications (NCC) > 1 - 6

2017 Twenty-third National Conference on Communications (NCC)

Automatic speech recognition can be used to evaluate the accuracy of read speech and thus serve a valuable role in literacy development by providing the needed feedback on reading skills in the absence of qualified teachers. Given the known limitations of ASR in the face of insufficient task-specific training data, the selection of acoustic and language modeling strategies can play a crucial role...

Keywords:
TRAINING
SPEECH RECOGNITION

Publication date

Set your own date range

Content availability

Available (1,462)
None (6)

Keywords

SPEECH (1,071)
HIDDEN MARKOV MODELS (756)
ACOUSTICS (384)
FEATURE EXTRACTION (377)
ACCURACY (187)
SPEECH PROCESSING (172)
MEL FREQUENCY CEPSTRAL COEFFICIENT (160)
DATABASES (147)
DATA MODELS (137)
SPEAKER RECOGNITION (136)
NEURAL NETWORKS (132)
ARTIFICIAL NEURAL NETWORKS (130)
COMPUTATIONAL MODELING (126)
NATURAL LANGUAGE PROCESSING (124)
TRAINING DATA (123)
SUPPORT VECTOR MACHINES (117)
AUTOMATIC SPEECH RECOGNITION (116)
TESTING (100)
VOCABULARY (91)
EMOTION RECOGNITION (90)
DATA MINING (87)
MATHEMATICAL MODEL (86)
ADAPTATION MODELS (85)
HIDDEN MARKOV MODEL (82)
DECODING (79)
ADAPTATION MODEL (78)
NOISE (78)
LEARNING (ARTIFICIAL INTELLIGENCE) (74)
ERROR ANALYSIS (69)
HMM (65)
CONTEXT (64)
CLASSIFICATION ALGORITHMS (57)
MAXIMUM LIKELIHOOD ESTIMATION (57)
GAUSSIAN PROCESSES (55)
PATTERN CLASSIFICATION (54)
LATTICES (53)
NEURAL NETS (53)
ROBUSTNESS (49)
CEPSTRAL ANALYSIS (48)
MFCC (48)
NOISE MEASUREMENT (48)
VECTORS (48)
DISCRIMINATIVE TRAINING (45)
PROBABILITY (45)
MACHINE LEARNING (43)
OPTIMIZATION (43)
STATISTICAL ANALYSIS (43)
KERNEL (41)
DICTIONARIES (40)
RECURRENT NEURAL NETWORKS (40)
TRANSFORMS (39)
SIGNAL TO NOISE RATIO (38)
DEEP NEURAL NETWORK (37)
LANGUAGE MODEL (37)
ACOUSTIC MODELING (36)
CONTEXT MODELING (36)
CORRELATION (36)
DEEP NEURAL NETWORKS (35)
NEURONS (35)
VISUALIZATION (34)
ENTROPY (33)
SUPPORT VECTOR MACHINE (33)
NATURAL LANGUAGES (32)
ACOUSTIC SIGNAL PROCESSING (31)
EQUATIONS (31)
SPEECH CODING (31)
SPEECH SYNTHESIS (30)
SPEECH ENHANCEMENT (29)
SUPPORT VECTOR MACHINE CLASSIFICATION (29)
GAUSSIAN MIXTURE MODEL (28)
VECTOR QUANTIZATION (28)
ESTIMATION (27)
NIST (27)
PATTERN RECOGNITION (27)
ROBUST SPEECH RECOGNITION (27)
SIGNAL CLASSIFICATION (27)
COMPUTERS (26)
SPEAKER IDENTIFICATION (25)
ACOUSTIC MODEL (24)
ALGORITHM DESIGN AND ANALYSIS (24)
COMPUTER ARCHITECTURE (24)
HUMANS (24)
PRINCIPAL COMPONENT ANALYSIS (24)
SPEECH EMOTION RECOGNITION (24)
DETECTORS (23)
STANDARDS (23)
COVARIANCE MATRIX (22)
MULTILAYER PERCEPTRONS (22)
TEXT ANALYSIS (22)
VITERBI ALGORITHM (22)
CLUSTERING ALGORITHMS (21)
LANGUAGE MODELING (21)
NEURAL NETWORK (21)
SIGNAL PROCESSING (21)
SPEAKER VERIFICATION (20)
WORD ERROR RATE (20)
ASR (19)
CONFERENCES (19)
more

INFONA - science communication portal

Search results

Segmentation the speech of hard of hearing children

A Novel Concept of the Rehabilitation Training Coach Robot for Patients with Disability

A study of support vector machines for emotional speech recognition

Towards intoxicated speech recognition

On the use of deep recurrent neural networks for detecting audio spoofing attacks

Symbolic manipulation based on deep neural networks and its application to axiom discovery

Improved speaker recognition system for stressed speech using deep neural networks

Speech-based emotion recognition and next reaction prediction

Identification of asphyxia in newborns using gpu for deep learning

Text dependent voice recognition system using MFCC and VQ for security applications

Lip-reading via a DNN-HMM hybrid system using combination of the image-based and model-based features

The phoneme set influence for lithuanian speech commands recognition accuracy

Research on multi-base depth neural network speech recognition

Fast speech keyword recognition based on improved filler model

Investigative study of various activation functions for speech recognition

Addressing data sparsity in DNN acoustic modeling

Speech signals identification base on improved DBN

Multidimensional speaker information recognition based on proposed baseline system

DNN acoustic models for dysarthric speech

Acoustic and language modeling for children's read speech assessment

Filter options

Publication date

Content availability

Keywords

INFONA - science communication portal

Search results

Add recipient

Sending message cancelled

Are you sure you want to cancel sending this message?

Send message

Filter options

Publication date

Date range setting

Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.

Content availability

Keywords

Reporting an error / abuse

Sending the report failed

Accessibility options