Search results

Items from 1 to 20 out of 627 results

chapter

Marathi digit recognition using lip geometric shape features and dynamic time warping

Aparna Brahme, Umesh Bhadade

TENCON 2017 - 2017 IEEE Region 10 Conference > 974 - 979

The aim of our proposed research work is to identify the language of a spoken utterance using visual speech recognition and to include the Marathi language in a language identification (LID) system. In this paper we focus on the task of identifying the first three digits in Marathi. For this, lips are first extracted from video frames of face images, and then landmark points on the lips are detected. Then...

chapter

A new speaker verification algorithm based on identification results

Khettaoui Billal, Dahimene Abdelhakim

2017 5th International Conference on Electrical Engineering - Boumerdes (ICEE-B) > 1 - 6

In this paper, a text-independent speaker recognition system based on Gaussian mixture models (GMM) was developed with a specific focus on the use of a voice activity detector (VAD) algorithm in training and testing. At the training level, a modified expectation-maximization (EM) algorithm is used. It is less prone to getting trapped around a local maximum and so has a better chance to converge...

chapter

Proposal of reminiscence therapy system using spoken dialog to suppress dementia

Ryota Nishimura, Takahiro Uchiya, Takahiro Hirano, Masaru Sakurai

2017 IEEE 6th Global Conference on Consumer Electronics (GCCE) > 1 - 2

The number of dementia patients has increased in recent years. The burden on caregivers has also increased. Nevertheless, no established treatment for dementia exists. Controlling dementia progression is an important goal of dementia treatment. The reminiscence method is one means of suppressing dementia progression. In the reminiscence method, a caregiver talks with a dementia patient. However, an...

chapter

Pseudo-pitch-synchronized phase information extraction and its application for robust speaker recognition

Longbiao Wang, Seiichi Nakagawa, Jianwu Dang, Jianguo Wei, et al.

2017 IEEE 6th Global Conference on Consumer Electronics (GCCE) > 1 - 5

Recent studies have shown that phase information contains speaker-dependent characteristics and is effective for speaker recognition. In this paper, we summarize a robust phase feature extracted from the Fourier spectrum (including pitch non-synchronized phase information and pseudo-pitch-synchronized phase information) and its application for speaker recognition for different speaking rate speech and...

chapter

An incremental intelligent object recognition system based on deep learning

Long Yan, Yongxiong Wang, Tianzhong Song, Zhong Yin

2017 Chinese Automation Congress (CAC) > 7135 - 7138

The accuracy of object recognition has been greatly improved by the rapid development of deep learning, but deep learning generally requires a lot of training data, and the training process is very slow and complex. We propose an incremental object recognition system based on deep learning techniques and speech recognition technology with high learning speed and wide applicability. The system...

chapter

Towards a Breakthrough Speaker Identification Approach for Law Enforcement Agencies: SIIP

Khaled Khelif, Yann Mombrun, Gerhard Backfried, Farhan Sahito, et al.

2017 European Intelligence and Security Informatics Conference (EISIC) > 32 - 39

This paper describes SIIP (Speaker Identification Integrated Project), a high-performance, innovative and sustainable Speaker Identification (SID) solution running over a large database of voice samples. The solution is based on the development, integration and fusion of a series of speech analytic algorithms which includes speaker model recognition, gender identification, age identification, language and accent...

chapter

Automated rating of recorded classroom presentations using speech analysis in Kazakh

Akzharkyn Izbassarova, Aidana Irmanova, Alex Pappachen James

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 393 - 397

Effective presentation skills can help one succeed in business, career and academia. This paper presents the design of speech assessment during an oral presentation and an algorithm for speech evaluation based on criteria of optimal intonation. As the pace of speech and its optimal intonation vary from language to language, developing an automatic identification of language during the presentation...

chapter

Development of speech emotion recognition system using deep belief networks in Malayalam language

Athira Chandran, D. Pravena, D. Govind

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 676 - 680

The goal of this work is to validate the impact of natural elicitation of emotions by the speakers during the development of speech emotion databases for Malayalam language. The work also proposes a Gaussian Mixture Model-Deep Belief Networks (GMM-DBN) based speech emotion recognition system. To test the effect of emotion elicitation by the speakers, two independent datasets with emotionally biased...

chapter

Significance of exploring pitch only features for the recognition of spontaneous emotions from speech signals

A. Pooja, D. Pravena, D. Govind

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 1438 - 1442

Emotional databases can be classified as containing spontaneous or simulated emotions. Spontaneous emotions can be identified based on two parameters, 1) arousal and 2) valence, represented in a two-dimensional plane. Arousal measures how calming or exciting the information is, whereas valence measures the positive or negative affectivity of the information. The objective of the paper is to predict the arousal...

chapter

GMM based automatic speaker verification system development for forensics in Bahasa Indonesia

Ivan Stefanus, R.S. Joko Sarwono, Miranti Indar Mandasari

2017 5th International Conference on Instrumentation, Control, and Automation (ICA) > 56 - 61

Speaker verification based on phonetic-acoustic approach and text-dependent framework has been applied for forensic purposes in Indonesian court since 2008. In order to accelerate the speaker verification process, an automatic text-independent system is developed. This automatic system employs MFCC features and GMM speaker modeling, a standard and simple approach used in automatic speaker recognition...

chapter

Throat microphone speech recognition using MFCC

Amritha Vijayan, Bipil Mary Mathai, Karthik Valsalan, Riyanka Raji Johnson, et al.

2017 International Conference on Networks & Advances in Computational Technologies (NetACT) > 392 - 395

The Throat Microphone (TM) is a non-acoustic device, relying on the vibrations of vocal folds rather than the audible sound produced. Correctly capturing vocal fold vibrations is difficult due to poor signal representation capabilities. The system recognizes the TM vibrations and produces the corresponding speech sound. This is done by extracting features from the spectrum of the TM vibrations and...

chapter

Feature selection in affective speech classification

Anguel Manolov, Ognian Boumbarov, Agata Manolova, Vladimir Poulkov, et al.

2017 40th International Conference on Telecommunications and Signal Processing (TSP) > 354 - 358

The increasing role of spoken language interfaces in human-computer interaction applications has created conditions to facilitate a new area of research — namely recognizing the emotional state of the speaker through speech signals. This paper proposes a text independent method for emotion classification of speech signals used for the recognition of the emotional state of the speaker. Different feature...

chapter

A DC motor speed control using the LPC-ANFIS speech recognition system

Muhammad Akil, Ingrid Nurtanio, Rhiza Samsoe'oed Sadjad

2017 15th International Conference on Quality in Research (QiR) : International Symposium on Electrical and Computer Engineering > 215 - 220

The aim of this research is to design an implementation of the speech recognition system to control the speed of a DC motor. The Linear Predictive Coding (LPC) method is used in the speech recognition system, tuned by the Adaptive Neuro-Fuzzy Inference Systems (ANFIS) method. There are 5 (five) samples of voice signals in Bahasa Indonesia recognized by this system, i.e.: “Nyala”, “Lambat”, “Sedang”,...

chapter

Efficient and Privacy-Preserving Voice-Based Search over mHealth Data

Mohammad Hadian, Thamer Altuwaiyan, Xiaohui Liang, Wei Li

2017 IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE) > 96 - 101

In-home IoT devices play a major role in healthcare systems as smart personal assistants. They usually come with a voice-enabled feature to add an extra level of usability and convenience to elderly, disabled people, and patients. In this paper, we propose an efficient and privacy-preserving voice-based search scheme to enhance the efficiency and the privacy of in-home healthcare applications. We...

chapter

Construction of a database of emotional speech using emotion sounds from movies and dramas

Youjung Ko, Insuk Hong, Hyunsoon Shin, Yoonjoong Kim

2017 International Conference on Information and Communications (ICIC) > 266 - 267

In this study, an emotional speech database called Hanbat Emotional Database (HEMO) was constructed using movie and drama scenes in which emotion is abundantly expressed by professional actors. HEMO consists of 454 speech samples classified into seven emotion categories: anger, happiness, sadness, disgust, surprise, fear, and neutral. In order to evaluate the performance of HEMO, consistent...

chapter

Development of a large-scale Mandarin Radio Speech Corpus

Yung-hsiang Shawn Chang, Yuan-fu Liao, Sheng-ming Wang, Jenq-haur Wang, et al.

2017 IEEE International Conference on Consumer Electronics - Taiwan (ICCE-TW) > 359 - 360

The Taiwan Mandarin Radio Speech Corpus consists of roughly 300 (and growing) hours of audio recordings, selected from Taiwan's National Education Radio (NER) archive. The corpus includes speech from hundreds of speakers and various speech styles (spontaneous conversational and read news). This corpus provides a rich resource for research in speech and automatic speech recognition (ASR). In this paper,...

chapter

Recognition of positive and negative emotions for Romanian language

Silvia Monica Feraru, Marius Dan Zbancioc

2017 E-Health and Bioengineering Conference (EHB) > 725 - 728

The paper presents the recognition of positive and negative emotions for the Romanian language. The main purpose of this study is to show how well emotions are recognized when the goal is not to identify the precise emotion expressed but only its general polarity: positive, negative or neutral. This can be useful for a human-machine interface. The positive emotions were recognized with an...

chapter

A review on speech emotion recognition: Case of pedagogical interaction in classroom

Leila Kerkeni, Youssef Serrestou, Mohamed Mbarki, Kosai Raoof, et al.

2017 International Conference on Advanced Technologies for Signal and Image Processing (ATSIP) > 1 - 7

Emotions play a key role in cognitive processes, particularly in learning. Educators should know the emotional state of each student during a teaching activity. They must help students to experiment, interact and explore new topics and constructs. Students must be in a state that maximizes their performance. To know the emotional state of a student, we need an emotion recognition system. It can be...

chapter

Detection of bypass fraud based on speaker recognition

Osama Mohamed Elrajubi, Ali Mustafa Elshawesh, Mustafa Ali Abuzaraida

2017 8th International Conference on Information Technology (ICIT) > 50 - 54

In the telecommunication industry, fraud has become a serious problem that affects service providers all around the world. Because a significant amount of revenue is lost to fraud every year, an efficient system to detect fraudulent activity is greatly needed. A well-known fraud that affects GSM and PSTN service providers is bypass fraud. It is used to avoid a charge of international calls...

chapter

Novel Applications of Complexity Inspired RDT Transform for Low Complexity Embedded Speech Recognition in Automotive Environments

Mihai Bucurica, Ioana Dogaru, Radu Dogaru

2017 21st International Conference on Control Systems and Computer Science (CSCS) > 375 - 378

Embedded dictation, i.e. recognizing vocal commands in noisy environments, with good accuracy and using low complexity implementations is a desirable task with many applications. Such applications include automotive infotainment solutions particularly when no connectivity is available, personal assistants including embedded dictation solutions for disabled people, and so on. This paper reports our...

Keywords:
DATABASES
SPEECH RECOGNITION

Content availability

Available (621)
None (6)

Keywords

SPEECH (505)
FEATURE EXTRACTION (212)
HIDDEN MARKOV MODELS (201)
EMOTION RECOGNITION (153)
TRAINING (147)
SPEECH PROCESSING (112)
MEL FREQUENCY CEPSTRAL COEFFICIENT (101)
ACCURACY (90)
ACOUSTICS (89)
SPEAKER RECOGNITION (76)
SUPPORT VECTOR MACHINES (58)
NATURAL LANGUAGE PROCESSING (49)
DATA MINING (48)
TESTING (36)
NOISE (34)
VOCABULARY (34)
MFCC (33)
AUTOMATIC SPEECH RECOGNITION (32)
COMPUTATIONAL MODELING (30)
SPEECH SYNTHESIS (30)
CEPSTRAL ANALYSIS (28)
DATA MODELS (28)
SPEECH EMOTION RECOGNITION (28)
ARTIFICIAL NEURAL NETWORKS (27)
ROBUSTNESS (27)
HIDDEN MARKOV MODEL (26)
CLASSIFICATION ALGORITHMS (25)
HMM (24)
FACE RECOGNITION (22)
TRAINING DATA (22)
GAUSSIAN PROCESSES (21)
NATURAL LANGUAGES (20)
NEURAL NETS (20)
LEARNING (ARTIFICIAL INTELLIGENCE) (19)
NOISE MEASUREMENT (19)
SPEECH CODING (19)
ALGORITHM DESIGN AND ANALYSIS (18)
COMPUTERS (18)
CONFERENCES (18)
DECODING (18)
HUMANS (18)
NEURAL NETWORKS (18)
SPEECH ANALYSIS (17)
FACE (16)
ROBOTS (16)
SERVERS (16)
SIGNAL PROCESSING (16)
STRESS (16)
VECTORS (16)
VISUALIZATION (16)
ADAPTATION MODEL (15)
CORRELATION (15)
MATHEMATICAL MODEL (15)
SPEAKER IDENTIFICATION (15)
SVM (15)
ERROR ANALYSIS (14)
GAUSSIAN MIXTURE MODEL (14)
SIGNAL TO NOISE RATIO (14)
STATISTICAL ANALYSIS (14)
ADAPTATION MODELS (13)
LABORATORIES (13)
MICROPHONES (13)
SPEECH ENHANCEMENT (13)
TRANSFORMS (13)
AUDIO DATABASES (12)
ESTIMATION (12)
GMM (12)
INFORMATION RETRIEVAL (12)
MOBILE COMMUNICATION (12)
PRINCIPAL COMPONENT ANALYSIS (12)
QUERY PROCESSING (12)
TEXT ANALYSIS (12)
AFFECTIVE COMPUTING (11)
DATABASE MANAGEMENT SYSTEMS (11)
DICTIONARIES (11)
INTERNET (11)
LABELING (11)
PITCH (11)
SPEAKER VERIFICATION (11)
SPEECH SIGNAL (11)
TELEPHONY (11)
COMPUTER AIDED INSTRUCTION (10)
CONTEXT MODELING (10)
DISCRETE COSINE TRANSFORMS (10)
EDUCATIONAL INSTITUTIONS (10)
ELECTRONIC MAIL (10)
ENGINES (10)
EQUATIONS (10)
HUMAN COMPUTER INTERACTION (10)
MACHINE LEARNING (10)
MAXIMUM LIKELIHOOD ESTIMATION (10)
PATTERN CLASSIFICATION (10)
SOFTWARE (10)
SPECTROGRAM (10)
CAMERAS (9)
DISCRETE WAVELET TRANSFORMS (9)
EDUCATION (9)
EMOTIONAL SPEECH (9)

INFONA - science communication portal
