Search results

Items from 81 to 100 out of 970 results

chapter

Sequential forward feature selection with low computational cost

Dimitrios Ververidis, Constantine Kotropoulos

2005 13th European Signal Processing Conference > 1 - 4

2005 13th European Signal Processing Conference

This paper presents a novel method to control the number of crossvalidation repetitions in sequential forward feature selection algorithms. The criterion for selecting a feature is the probability of correct classification achieved by the Bayes classifier when the class feature probability density function is modeled by a single multivariate Gaussian density. Let the probability of correct classification...

chapter

Error handling in multimodal biometric systems using reliability measures

Krzysztof Kryszczuk, Jonas Richiardi, Plamen Prodanov, Andrzej Drygajlo

2005 13th European Signal Processing Conference > 1 - 4

2005 13th European Signal Processing Conference

In this paper, we present a framework for predicting and correcting classification decision errors based on modality reliability measures in a multimodal biometric system. In our experiments we use face and speech experts based on a recently proposed framework which uses Bayesian networks. The expert decisions and the accompanying information on their reliability are combined in a decision module...

chapter

Combining POS taggers in master-slaves technique for highly inflected languages as Arabic

Ahmed H. Aliwy

2015 International Conference on Cognitive Computing and Information Processing(CCIP) > 1 - 5

2015 International Conference on Cognitive Computing and Information Processing (CCIP)

Part Of Speech tagging (POS) is the basic process for almost all natural language processing (NLP) applications. The typical methods for combining different taggers, the program doing POS tagging, are voting or stacking techniques. We propose here a Master-Slaves Technique, which can combine Hidden Markov Model (HMM) tagger as master and any number of other taggers of any type as slaves. We describe...

chapter

The effects of listening agent in speech-based on-line test system

Hidemasa Kimura, Jumpei Hayashi, Yuichi Demise, Dai Hasegawa, more

2015 IEEE Global Engineering Education Conference (EDUCON) > 366 - 370

2015 IEEE Global Engineering Education Conference (EDUCON)

Speech-based test is a one of the best ways to train students' level of proficiency. However, in e-learning it is difficult for students to maintain their concentration when speaking to a monitor. We propose speech-based online test system employing a human-likely embodied graphical agent as a listener. The developed system consists of three modules, user recognition, agent's non-verbal behavior realization,...

chapter

Voice pathology detection with MDVP parameters using Arabic voice pathology database

Ahmed Al-nasheri, Zulfiqar Ali, Ghulam Muhammad, Mansour Alsulaiman, more

2015 5th National Symposium on Information Technology: Towards New Smart World (NSITNSW) > 1 - 5

2015 5th National Symposium on Information Technology: Towards New Smart World (NSITNSW)

This paper investigates the use of MultiDimensional Voice Program (MDVP) parameters to automatically detect voice pathology in Arabic voice pathology database (AVPD). MDVP parameters are very popular among the physician / clinician to detect voice pathology; however, MDVP is a commercial software. AVPD is a newly developed speech database designed to suit a wide range of experiments in the field of...

chapter

An efficient visualized clustering approach (VCA) for various datasets

K. Rajendra Prasad, B. Eswara Reddy

2015 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems (SPICES) > 1 - 5

2015 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems (SPICES)

The aim of clustering is to discover the clusters based on the similarity features of objects. The present algorithm of visual access tendency (VAT) can access an exact number of clusters by its VAT image. The VAT image displays the squared shaped dark blocks along the diagonal; number of cluster information is accessed by counting the number of obtaining square blocks. Other extended versions are...

chapter

Data-driven pause prediction for speech synthesis in storytelling style speech

Parakrant Sarkar, K. Sreenivasa Rao

2015 Twenty First National Conference on Communications (NCC) > 1 - 5

2015 Twenty First National Conference on Communications (NCC)

In the storyteller speech, pauses plays a significant role in introducing suspense and climax. Pauses are used to emphasize keywords, emotion-salient words and separate the phrases in the utterance. The objective of this work is to predict the position and duration of the pauses in the synthesized speech from the text-to-speech system. We analyzed the pause patterns in storyteller speech and classified...

chapter

Automatic gender classification using the mel frequency cepstrum of neutral and whispered speech: A comparative study

G. Nisha Meenakshi, Prasanta Kumar Ghosh

2015 Twenty First National Conference on Communications (NCC) > 1 - 6

2015 Twenty First National Conference on Communications (NCC)

A whispered speech resembles an unvoiced speech due to the lack of vocal fold vibration unlike the neutral speech. Since information about the gender of a speaker typically lies in the pitch resulted from the vocal fold vibration (or source signal), identifying gender from the whispered speech is more challenging compared to that from the neutral speech. In the absence of the pitch, we study the use...

chapter

Speaker based Language Independent Isolated Speech Recognition System

Shanthi Therese S., Chelpa Lingam

2015 International Conference on Communication, Information & Computing Technology (ICCICT) > 1 - 7

2015 International Conference on Communication, Information & Computing Technology (ICCICT)

This paper presents a speaker based Language Independent Isolated Speech Recognition System (LIISRS). The most popular feature extraction technique Mel Frequency Cepstral Coefficients (MFCC) is used for training the system. Representative specific features are identified using K-Means algorithm. Distortion measure is calculated using Euclidian distance function. Pitch contour characteristics are used...

chapter

A unique approach in text independent speaker recognition using MFCC feature sets and probabilistic neural network

Khan Suhail Ahmad, Anil S. Thosar, Jagannath H. Nirmal, Vinay S. Pande

2015 Eighth International Conference on Advances in Pattern Recognition (ICAPR) > 1 - 6

2015 Eighth International Conference on Advances in Pattern Recognition (ICAPR)

This paper motivates the use of combination of mel frequency cepstral coefficients (MFCC) and its delta derivatives (DMFCC and DDMFCC) calculated using mel spaced Gaussian filter banks for text independent speaker recognition. MFCC modeled on the human auditory system shows robustness against noise and session changes and hence has become synonymous with speaker recognition. Our main aim is to test...

chapter

Two-stage phone recognition system using articulatory and spectral features

K E Manjunath, K. Sreenivasa Rao, M Gurunath Reddy

2015 International Conference on Signal Processing and Communication Engineering Systems > 107 - 111

2015 International Conference on Signal Processing And Communication Engineering Systems (SPACES)

In this paper, we propose a two-stage phone recognition system using articulatory and spectral features. In the first stage, articulatory features are predicted from spectral features using FeedForward Neural Networks (FFNNs). In the second stage, phone recognition is carried out using the predicted articulatory features and spectral features together. FFNNs and Hidden Markov Models are explored for...

chapter

Improvement of phone recognition accuracy using source and system features

K E Manjunath, K. Sreenivasa Rao, M Gurunath Reddy

2015 International Conference on Signal Processing and Communication Engineering Systems > 501 - 505

2015 International Conference on Signal Processing And Communication Engineering Systems (SPACES)

The goal of this work is to improve phone recognition accuracy using combination of source and system features. As speech is produced by exciting time varying vocal tract system with time varying excitation, we want to explore both source and system components of speech production system for phone recognition. The excitation source information is derived by processing linear prediction residual of...

chapter

Extracting situational awareness from microblogs during disaster events

Anirban Sen, Koustav Rudra, Saptarshi Ghosh

2015 7th International Conference on Communication Systems and Networks (COMSNETS) > 1 - 6

2015 7th International Conference on Communication Systems and Networks (COMSNETS)

Microblogging sites such as Twitter and Weibo are increasingly being used to enhance situational awareness during various natural and man-made disaster events such as floods, earthquakes, and bomb blasts. During any such event, thousands of microblogs (tweets) are posted in short intervals of time. Typically, only a small fraction of these tweets contribute to situational awareness, while the majority...

chapter

An automatic input protocol recommendation method for tailored switch-to-speech communication aid systems

Fuming Fang, Takahiro Shinozaki, Takao Kobayashi

Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific > 1 - 7

2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

A switch-to-speech interface can provide a means of interactive communication as a support system for people with disabilities with voluntary movements. Any motion of a part of the body, such as eye movements, can be used for the switch input. The number of possible switch operations varies from person to person, but the bandwidth is generally quite limited. Therefore, efficient input protocols are...

chapter

An inter-speaker evaluation through simulation of electrolarynx control based on statistical F₀ prediction

Kou Tanaka, Tomoki Toda, Graham Neubig, Sakriani Sakti, more

Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific > 1 - 4

2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

An electrolarynx is a device that artificially generates excitation sounds to produce electrolaryngeal (EL) speech. Although proficient laryngectomees can produce intelligible EL speech by using this device, it sounds quite unnatural due to the mechanical excitation. To address this issue, we have proposed several EL speech enhancement methods using statistical voice conversion and showed that statistical...

chapter

Unsupervised speaker adaptation of DNN-HMM by selecting similar speakers for lecture transcription

Masato Mimura, Tatsuya Kawahara

Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific > 1 - 4

2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

Unsupervised speaker adaptation of Deep Neural Network (DNN) is investigated for lecture transcription tasks, in which a single speaker gives a long speech and thus speaker adaptation is important. The proposed method selects similar speakers to the test data (test speaker) from the training database, which are used for retraining the baseline DNN. Several speaker characteristic features are defined...

chapter

Speech and music classification using hybrid Form of spectrogram and fourier transformation

Piyawat Neammalai, Suphakant Phimoltares, Chidchanok Lursinsap

Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific > 1 - 6

2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

This paper presents the technique for feature extraction to classify speech and music audio data. The combination of image processing and signal processing is used to classify audio data. There are three main steps. First, the audio data is segments and transformed to spectrogram image and then apply image processing methods to find the salient characteristics on the spectrogram image. The next step...

chapter

Enhancement of EMG-based Thai number words classification using frame-based time domain features with stacking filter

Niyawadee Srisuwan, Michael Wand, Matthias Janke, Pornchai Phukpattaranont, more

Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific > 1 - 6

2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

In order to overcome a problem existing in a classical automatic speech recognition (e.g. ambient noise and loss of privacy), Electromyography (EMG) from speech production muscles was used in place of a human speech signal. We aim to investigate the EMG speech recognition based on Thai language. The earlier work, we used five channels of the EMG from the facial and neck muscles to classify 11 Thai...

chapter

Estimation of Japanese DRT intelligibility using Articulation Index Band Correlations

Kazuhiro Kondo

Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific > 1 - 6

2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

We proposed and evaluated an estimation method for the forced selection Japanese Diagnostic Rhyme Test (DRT). The proposed measure takes into account the forced selection manner of the DRT from a pair of rhyming words. The objective distance measure used here was based on the Articulation index Band Correlation (ABC), which showed favorable results for the English Modified Rhyme Test (MRT). The correlation...

chapter

Classification of emotions from speech using implicit features

Mohit Srivastava, Anupam Agarwal

2014 9th International Conference on Industrial and Information Systems (ICIIS) > 1 - 6

2014 9th International Conference on Industrial and Information Systems (ICIIS)

Human computer interaction with the time has extended its branches to many different other fields like engineering, cognition, medical etc. Speech analysis has also become an important area of concern. People involved are using this mode for the interaction with the machines to bridge the gap between physical and digital world. Speech emotion recognition has become an integral subfield in the domain...

Keywords:
ACCURACY
SPEECH

Publication date

Set your own date range

Content availability

Available (958)
None (12)

Keywords

SPEECH RECOGNITION (465)
FEATURE EXTRACTION (332)
HIDDEN MARKOV MODELS (261)
TRAINING (241)
SPEECH PROCESSING (186)
ACOUSTICS (159)
DATABASES (139)
MEL FREQUENCY CEPSTRAL COEFFICIENT (132)
SUPPORT VECTOR MACHINES (119)
NOISE (98)
SPEAKER RECOGNITION (89)
DATA MINING (85)
NATURAL LANGUAGE PROCESSING (83)
EMOTION RECOGNITION (76)
ARTIFICIAL NEURAL NETWORKS (62)
ESTIMATION (60)
CLASSIFICATION ALGORITHMS (57)
SIGNAL TO NOISE RATIO (53)
AUTOMATIC SPEECH RECOGNITION (51)
COMPUTATIONAL MODELING (48)
VECTORS (48)
CORRELATION (45)
NOISE MEASUREMENT (44)
HUMANS (40)
CEPSTRAL ANALYSIS (39)
EDUCATIONAL INSTITUTIONS (39)
ALGORITHM DESIGN AND ANALYSIS (38)
MATHEMATICAL MODEL (38)
PATTERN CLASSIFICATION (38)
SIGNAL PROCESSING (38)
SPEAKER IDENTIFICATION (37)
LEARNING (ARTIFICIAL INTELLIGENCE) (36)
ROBUSTNESS (36)
TAGGING (36)
TESTING (36)
DECODING (35)
SPEECH CODING (35)
TRAINING DATA (35)
COMPUTERS (34)
GAUSSIAN PROCESSES (34)
MFCC (34)
ADAPTATION MODEL (33)
DATA MODELS (33)
CONFERENCES (32)
CONTEXT (32)
SPEECH SYNTHESIS (32)
HIDDEN MARKOV MODEL (31)
SPEECH ENHANCEMENT (31)
KERNEL (29)
MICROPHONES (29)
VISUALIZATION (29)
DICTIONARIES (27)
INDEXES (27)
SUPPORT VECTOR MACHINE (27)
TRANSFORMS (27)
EQUATIONS (25)
SIGNAL PROCESSING ALGORITHMS (25)
TEXT ANALYSIS (25)
AUDIO SIGNAL PROCESSING (24)
GMM (24)
SVM (24)
VOCABULARY (24)
CLASSIFICATION (23)
ENTROPY (23)
GAUSSIAN MIXTURE MODEL (23)
MACHINE LEARNING (23)
PRINCIPAL COMPONENT ANALYSIS (23)
STATISTICAL ANALYSIS (23)
ACOUSTIC SIGNAL PROCESSING (22)
OPTIMIZATION (22)
PROBABILITY (22)
SPEECH ANALYSIS (21)
ANALYTICAL MODELS (20)
COMPLEXITY THEORY (20)
SIGNAL CLASSIFICATION (20)
TIME FREQUENCY ANALYSIS (20)
DECISION TREES (19)
DELAY (19)
HMM (19)
MAXIMUM LIKELIHOOD ESTIMATION (19)
PATTERN RECOGNITION (19)
SEMANTICS (19)
SUPPORT VECTOR MACHINE CLASSIFICATION (19)
ELECTRONIC MAIL (18)
ERROR ANALYSIS (18)
LABELING (18)
REAL TIME SYSTEMS (18)
ROBOTS (18)
ADAPTATION MODELS (17)
FACE (17)
FILTERING (17)
HARMONIC ANALYSIS (17)
MUSIC (17)
NEURAL NETWORKS (17)
STRESS (17)
DETECTORS (16)
INFORMATION RETRIEVAL (16)
NATURAL LANGUAGES (16)
more

INFONA - science communication portal

Search results

Sequential forward feature selection with low computational cost

Error handling in multimodal biometric systems using reliability measures

Combining POS taggers in master-slaves technique for highly inflected languages as Arabic

The effects of listening agent in speech-based on-line test system

Voice pathology detection with MDVP parameters using Arabic voice pathology database

An efficient visualized clustering approach (VCA) for various datasets

Data-driven pause prediction for speech synthesis in storytelling style speech

Automatic gender classification using the mel frequency cepstrum of neutral and whispered speech: A comparative study

Speaker based Language Independent Isolated Speech Recognition System

A unique approach in text independent speaker recognition using MFCC feature sets and probabilistic neural network

Two-stage phone recognition system using articulatory and spectral features

Improvement of phone recognition accuracy using source and system features

Extracting situational awareness from microblogs during disaster events

An automatic input protocol recommendation method for tailored switch-to-speech communication aid systems

An inter-speaker evaluation through simulation of electrolarynx control based on statistical F₀ prediction

Unsupervised speaker adaptation of DNN-HMM by selecting similar speakers for lecture transcription

Speech and music classification using hybrid Form of spectrogram and fourier transformation

Enhancement of EMG-based Thai number words classification using frame-based time domain features with stacking filter

Estimation of Japanese DRT intelligibility using Articulation Index Band Correlations

Classification of emotions from speech using implicit features

Filter options

Publication date

Content availability

Keywords

INFONA - science communication portal

Search results

Add recipient

Sending message cancelled

Are you sure you want to cancel sending this message?

Send message

Filter options

Publication date

Date range setting

Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.

Content availability

Keywords

Reporting an error / abuse

Sending the report failed

Accessibility options