Automatic voiceprint recognition, based on the human speech signal, serves many salient practical applications. A number of studies have been undertaken on the basis of normal speech. This research aims to develop an automatic voiceprint recognition system based on emotional speech signals in the Indonesian language. The study is limited to four different people with speeches of four distinctive emotional...
For speech recognition systems, an improved multi-base neural network speech recognition model is proposed to address the long learning time and slow convergence rate of deep neural networks. However, the improved model introduces a large number of parameters during training, causing it to overfit on the test set, which degrades its generalization ability and the...
Emotional decoding ability has repeatedly been shown to be impaired in alcoholic patients. The present study aims to extend previous findings on emotion deficits by examining auditory stimuli recognition ability in alcoholism. Twenty-six alcohol-dependent patients, abstinent from alcohol for at least four weeks, were compared to 26 controls matched for sex, age and socioeconomic level. Subjects...
This paper presents a review of a few notable speech recognition models reported in the last decade. Firstly, the models are categorized into sparse models, learning models and domain-specific models. Subsequently, the characteristics of the models are examined in terms of speech constraints, algorithmic constraints and performance constraints. The performance of these models reported in...
This paper suggests an MFCC-based analysis technique for audio signals, applied to speech classification. The proposed work uses multi-resolution (wavelet) analysis and spectral-analysis-based features for feature extraction. The approach combines a number of features, such as Mel Frequency Cepstral Coefficients (MFCC) and FFT coefficients, with wavelet-based features. In addition, accuracy...
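As a rough illustration of the MFCC pipeline this abstract refers to (not the authors' implementation), the following NumPy sketch computes MFCC features from a raw waveform. The frame length, hop size, filterbank size and coefficient count are conventional assumed defaults, not values taken from the paper:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=16000, n_fft=512, n_mels=26, n_ceps=13):
    """Return an (n_frames, n_ceps) matrix of MFCC features."""
    # Frame into 25 ms windows with a 10 ms hop, Hamming-windowed
    frame_len, hop = int(0.025 * sr), int(0.010 * sr)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    frames = np.stack([signal[i * hop : i * hop + frame_len]
                       for i in range(n_frames)])
    frames = frames * np.hamming(frame_len)
    # Power spectrum via FFT
    spec = np.abs(np.fft.rfft(frames, n_fft)) ** 2
    # Triangular mel-spaced filterbank
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        for k in range(l, c):
            fbank[m - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fbank[m - 1, k] = (r - k) / max(r - c, 1)
    log_energy = np.log(spec @ fbank.T + 1e-10)
    # DCT-II to decorrelate log filterbank energies; keep first n_ceps
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_mels))
    return log_energy @ dct.T
```

One second of 16 kHz audio yields 98 frames of 13 coefficients; the same log-filterbank stage could be swapped for wavelet subband energies, which is roughly where the paper's wavelet-based features would enter.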
This paper introduces a fully automated Arabic phone recognition system based on a 15-point Enhanced Wavelet Packets Best Tree Encoding (EWPBTE) speech feature. WPBTE is enhanced by adding an energy component; the method, implemented in Matlab, improves recognizer accuracy by 65 %, which is the main contribution of this paper. EWPBTE...
Automatic recognition of emotions in speech has attracted the attention of the research community in recent years. Some of its most relevant proposed applications are in call centers. In these scenarios the speech is distorted by compression algorithms, and the effects of such distortion on the performance of automatic emotion recognition systems must be assessed. In this study these effects...
The paper deals with the problem of improving speech recognition by combining the outputs of several different recognizers. We present results obtained by experimenting with different classification methods suitable for combining the outputs of different speech recognizers. The evaluated methods are: k-Nearest Neighbors (KNN), Linear Discriminant Analysis (LDA), Quadratic Discriminant...
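The simplest baseline for combining recognizer outputs, against which classifier-based combiners like those above are usually compared, is a weighted majority vote over the per-recognizer hypotheses. A minimal sketch (the function name and weighting scheme are illustrative assumptions, not from the paper):

```python
from collections import Counter

def combine_hypotheses(hypotheses, weights=None):
    """Weighted majority vote over word hypotheses from several recognizers.

    hypotheses: one recognized word per recognizer, e.g. ["seven", "seven", "eleven"]
    weights:    optional per-recognizer reliability weights (default: equal)
    """
    weights = weights or [1.0] * len(hypotheses)
    votes = Counter()
    for hyp, w in zip(hypotheses, weights):
        votes[hyp] += w
    # Return the hypothesis with the largest accumulated weight
    return votes.most_common(1)[0][0]
```

A trained combiner (KNN, LDA, etc.) generalizes this by learning, from recognizer confidence scores, when to trust which recognizer instead of using fixed weights.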
This paper deals with live subtitling of TV ice-hockey commentaries using automatic speech recognition technology. Two methods are presented - a direct transcription of the TV program and a re-speaking approach. Practical issues emerging from the real subtitling system are introduced and solutions are proposed. Acoustic and language modelling are described, as well as modifications of the existing live...
In low-resource Automatic Speech Recognition (ASR), one usually resorts to the Statistical Machine Translation (SMT) technique to learn transform rules for refining the grapheme lexicon. In doing so, we face two challenges. One is to generate grapheme sequences from the training data as targets, which are paired with the original transcripts to train SMT models; the other is to effectively prune the learned...
This paper introduces a new back-end classifier for a speech recognition system that is based on artificial life (ALife). The ALife species being used for classification purposes are called wains, which were developed using the Créatúr framework. The speech recognition task used in the evaluation of the new classifier is that of isolated digit recognition. Performance of the proposed back-end classifier...
There are many popular algorithms for recognizing the human voice. A good algorithm not only yields high recognition accuracy but is also robust to noise. Several experiments are conducted in this research to verify the performance of a neuro-fuzzy system in recognizing the human voice. Eight Thai words, recorded in different environments, syllables and pronunciations, are used as a data set...
This paper presents experiments on feature selection for emotional speech classification. There are 152 features used in this experiment. Minimum redundancy maximum relevance (mRMR) is applied as the feature selection method. The experiments are constructed from two corpora: Interactive Emotional Dyadic Motion Capture (IEMOCAP) and Emotional Tagged Corpus on Lakorn (EMOLA), which...
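The mRMR criterion named above greedily picks features that are maximally relevant to the label while minimally redundant with features already chosen. The sketch below illustrates that greedy loop; note it uses absolute Pearson correlation as a simple stand-in for the mutual information used in standard mRMR, so it is an assumption-laden illustration, not the paper's method:

```python
import numpy as np

def mrmr_select(X, y, k):
    """Greedy mRMR-style selection of k column indices from X.

    Relevance  = |corr(feature, y)|
    Redundancy = mean |corr(feature, already-selected features)|
    Score      = relevance - redundancy  (maximized at each step)
    """
    def corr(a, b):
        a, b = a - a.mean(), b - b.mean()
        denom = np.sqrt((a @ a) * (b @ b))
        return abs(a @ b) / denom if denom > 0 else 0.0

    n_feat = X.shape[1]
    relevance = np.array([corr(X[:, j], y) for j in range(n_feat)])
    selected = [int(np.argmax(relevance))]  # start with the most relevant
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(n_feat):
            if j in selected:
                continue
            redundancy = np.mean([corr(X[:, j], X[:, s]) for s in selected])
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected
```

The key behavior is that an exact duplicate of an already-selected feature scores poorly despite high relevance, so the 152-feature set is pruned toward complementary features.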
Vietnamese is a syllable-based tonal language in which the tone used in syllable pronunciation carries important information about the meaning. In this paper, we investigate several approaches to incorporating tone into an acoustic model. We propose 3 basic strategies: a) phoneme-based, b) vowel-based, and c) rhyme-based. Each can be modified so that we obtain 15 different schemes that...
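The three strategies differ only in which acoustic-model units carry the tone tag. A toy sketch of that distinction (the function, the simplified vowel inventory, and the tag format are all illustrative assumptions, not the paper's unit definitions):

```python
def tone_units(syllable_phonemes, tone, scheme="vowel"):
    """Attach a tone tag to a syllable's phonemes under three schemes:
    'phoneme' tags every phoneme, 'vowel' tags only vowels,
    'rhyme' tags everything from the first vowel onward (the rhyme)."""
    VOWELS = {"a", "e", "i", "o", "u"}  # simplified inventory, illustration only
    if scheme == "phoneme":
        return [f"{p}_{tone}" for p in syllable_phonemes]
    if scheme == "vowel":
        return [f"{p}_{tone}" if p in VOWELS else p for p in syllable_phonemes]
    # rhyme-based: find the first vowel and tag it plus the coda
    idx = next((i for i, p in enumerate(syllable_phonemes) if p in VOWELS),
               len(syllable_phonemes))
    return syllable_phonemes[:idx] + [f"{p}_{tone}" for p in syllable_phonemes[idx:]]
```

Tagging more units multiplies the phone inventory (and hence model size and data sparsity), which is the trade-off the 15 schemes presumably explore.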
In this paper, a microcontroller-based automatic door opening system is developed. The system is built as a speech recognition circuit in which a programmable (i.e., trained) voice is used as the reference for identifying authorized and unauthorized persons. On the software side, a MATLAB GUI has been used to record the authorized voice and to synthesize the recorded...
People who have lost their walking and moving ability need to use a wheelchair. In cases of losing complete control of the upper and lower limbs, intelligent solutions are required to ensure the autonomy and independence of those patients. The intelligent application must be designed carefully to use the available self-controlled electrical and physical activity of the patient's body like sound, Electromyogram...
In speech development research, it is important to know how speech acoustic features vary as a function of age, and at what age the variability and magnitude of acoustic features start to exhibit adult-like patterns. During the first few years of life, a child's speech changes from the cries and babbles of an infant to the adult-like words and phrases of a young child. A number of acoustic studies observed...
Speech recognition is widely applied in speech-to-text and speech-to-emotion tasks, to make gadgets and computers easier to use or to help people with hearing disabilities. Feature extraction is one of the significant steps determining the performance of speech recognition, so proper selection is essential. In this paper, we analyze feature extraction methods that can perform well for Indonesian speech...
An energy-efficient speech extraction (SE) processor is proposed for robust speech recognition in head-mounted display (HMD) systems. Speech extraction is essential for robust speech recognition in noisy environments. For low-latency speech extraction, FastSE is proposed to overcome the 50x more complex cICA-based selection process, resulting in <2 ms SE latency. Moreover, a reinforced FastSE...
Parallel Phoneme Recognition followed by Language Modelling (PPRLM) systems currently provide state-of-the-art language identification performance on conversational telephone speech. In this paper an innovative method for tonal and non-tonal language pre-classification using prosodic information is reported. Our motivation is to improve recognition accuracy and reduce CPU run-time while...