This article presents a method that uses Linear Prediction Coefficients (LPC) and Mel-Frequency Cepstral Coefficients (MFCC) as features to classify normal and abnormal cardiac sounds. Three feature vectors were tested: LPC-only, MFCC-only and LPC + MFCC. Several experiments were conducted with three classifiers: Support Vector Machine (SVM), K-Nearest Neighbor (KNN) and Random Forests, using...
Speaker identification is a field whose use is growing rapidly in security systems and forensic sciences. Depending on the task, online or offline applications are presented. How accurate and fast these applications are, and how computationally demanding they are, are important questions. In this study, the accuracy and the speed of the classifiers that can be used on speaker identification and the...
This paper presents a study on how the performance of a Phonetic Engine (PE) varies with the set of spectral features selected for it. An exclusive study is carried out with a PE developed for the Manipuri language. Here, we built the PE using phonetic transcriptions and modeling each phonetic unit with a Hidden Markov Model (HMM). The symbols of International Phonetic Alphabet (IPA) (revised in...
Physiological and behavioural human characteristics are exploited in biometrics, and performance metrics are used to measure some characteristic of an individual. The measure might lead to a one-to-one match, which is called authentication, or to a one-from-N match, which represents identification. In this paper, we exploit a speech biometric I-vector with low and fixed dimension of 100 to identify speakers...
In this paper, we describe an unsupervised method to segment birdcalls from the background in bioacoustic recordings. The method utilizes information derived from both source features as well as system features. Three types of source features are extracted from the linear prediction residual signal, and Mel frequency cepstral coefficients are extracted from the system features. The source features...
Automatic Speech Recognition (ASR) systems suffer from many types of noise in different environments. Developing robust ASR systems is an attractive research topic due to high demand in many commercial applications. In this paper, the Mel-Frequency Cepstral Coefficients (MFCC) are modified to be robust to noise, where the spectrogram is used as the time-frequency analysis tool. The proposed...
An Automatic Speech Recognition (ASR) system is defined as the transformation of acoustic speech signals into a string of words. This paper presents an ASR approach based on isolated words using Mel-Frequency Cepstral Coefficients (MFCCs), Dynamic Time Warping (DTW) and K-Nearest Neighbor (KNN) techniques. The Mel-Frequency scale used to capture the significant characteristics of the speech...
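The DTW and KNN combination named in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names `dtw_distance` and `knn_classify`, the Euclidean frame distance, and the toy one-dimensional "feature" sequences are all assumptions made for the example. In practice each frame would be an MFCC vector.

```python
import math

def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance (assumed)
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b frame repeated
                                 cost[i][j - 1],      # b advances, a frame repeated
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def knn_classify(query, templates, k=1):
    """Label a query sequence by majority vote among its k DTW-nearest templates.

    templates is a list of (label, sequence) pairs (hypothetical layout).
    """
    ranked = sorted(templates, key=lambda t: dtw_distance(query, t[1]))
    votes = {}
    for label, _ in ranked[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# Toy data: one-dimensional frames standing in for MFCC vectors.
templates = [
    ("yes", [[0.0], [1.0], [2.0]]),
    ("no", [[5.0], [5.0], [5.0]]),
]
query = [[0.0], [1.0], [1.0], [2.0]]  # a time-stretched version of "yes"
predicted = knn_classify(query, templates)
```

DTW lets the stretched query align with the shorter "yes" template at zero cost, which is why it suits isolated words spoken at different speeds.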
Speaker identification is a biometric technique for determining an unknown speaker's identity among a number of speakers using distinguishing latent information in uttered speech. Crime investigation, security control, telephone banking and trading, and information reservation are some applications of this technique. Frequency Domain Linear Prediction (FDLP) is a time-frequency-based feature that has been...
This study was performed to evaluate the feasibility of short-time energy as an input feature vector serving as the recognition key in a voice biometric system to recognize Cerebral Palsy (CP). To retrieve the characteristics of the voice, Mel-Frequency Cepstral Coefficients (MFCC) were used as the feature extraction algorithm, while Neuro Fuzzy was used as the classifier algorithm...
Present Mel Frequency Cepstral Coefficient (MFCC) based Bangla Automatic Speech Recognition (ASR) systems are mostly implemented with delta and acceleration coefficients. With the delta and acceleration coefficients of the MFCC and the log energy, a vector of 39 dimensions is obtained every 10 ms. In this paper, our objective is to observe the effect of third differential coefficients on the performance...
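The 39-dimensional figure can be illustrated with a short sketch. It assumes the common layout of 13 static values per frame (12 MFCCs plus log energy), which the abstract does not state explicitly, and it uses the standard regression formula for deltas with a window of two frames on each side. The names `delta` and `stack_with_derivatives` are hypothetical.

```python
def delta(frames, width=2):
    """Regression-based time derivative of each coefficient.

    Frames beyond the edges are replaced by the first or last frame.
    """
    total = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(total):
        row = []
        for k in range(len(frames[t])):
            num = 0.0
            for n in range(1, width + 1):
                nxt = frames[min(t + n, total - 1)][k]
                prv = frames[max(t - n, 0)][k]
                num += n * (nxt - prv)
            row.append(num / denom)
        out.append(row)
    return out

def stack_with_derivatives(static, order=2, width=2):
    """Append derivatives up to the given order to every static frame."""
    streams = [static]
    for _ in range(order):
        streams.append(delta(streams[-1], width))
    return [sum((s[t] for s in streams), []) for t in range(len(static))]

# Ten frames of 13 static values each (assumed: 12 MFCCs + log energy),
# filled with a linear ramp so the derivatives are easy to check by hand.
static = [[float(t)] * 13 for t in range(10)]
with_accel = stack_with_derivatives(static, order=2)  # static + delta + acceleration
with_third = stack_with_derivatives(static, order=3)  # adds third differential
```

Under this assumption, two derivative orders give 13 × 3 = 39 values per frame, and adding the third differential coefficients gives 13 × 4 = 52.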
In this paper we present a new database with speech recordings in Spanish. The database contains recordings of 54 native Spanish speakers. It is suitable for the development and testing of better Speaker Verification systems. The recording procedure, equipment and speech tasks are detailed. Experiments using the GMM-UBM speaker verification methodology were performed. The methodology...
Speech enhancement using Kalman filter is an extensively researched area. The vast majority of work done in this area uses linear predictive coding (LPC) for modeling speech signal. A few important studies have revealed the superiority of Mel Frequency Cepstral Coefficients (MFCC) over LPC for speech recognition. With this paper, the shortcomings of speech enhancement using LPC with Kalman filters...
In this paper, we propose a personalized music recommender service based on a Mamdani Fuzzy Inference System (M-FIS). A collection of playlists is used to gather users' choices and moods while listening to songs. Similarity between audio files is calculated based on Mel Frequency Cepstral Coefficients (MFCC). We have developed a recommender model based on M-FIS with the aforementioned similarities...
The two major applications of speaker recognition are speaker verification and speaker identification. In most cases, however, the signal is corrupted by background interference such as noise and echo. This paper proposes a method of speaker recognition and identification after noise separation. Support Vector Machine (SVM) classification based signal separation is adopted here...
Feature extraction is a crucial part of a large number of audio tasks. Researchers have extracted audio features in multiple ways, among which some of the most recent methods are based on the hidden layer of a trained neural network. In this paper, we present a system which can automatically extract features from unlabeled audio data, and then the features extracted by the system are used for audio...
A music signal is a one-dimensional temporal sequence. It is therefore difficult for listeners to quickly find the most attractive parts of popular songs unless they play the song to the end. In order to improve the listening experience, music summarization, a tool to summarize the song using the most attractive sections, is needed. In this paper, a system and method are presented...
This paper presents an FPGA-based real-time acoustic feature extraction method based on MFCC (Mel-Frequency Cepstral Coefficients). The proposed system enables automatic audio indexing of broadcast data from the European standard Frequency Modulation (FM) radio band. Using a model-based design approach that reduces overall design time, we successfully implemented it on a Virtex 6 FPGA clocked at more...
Speaker verification deals with the task of confirming a claimed identity using a hypothesized speaker model and a speaker model database. This work concentrates on a speaker verification system that combines GMM and SVM. The feature vectors used for modelling are Mel Frequency Cepstral Coefficients (MFCC). The database is collected through different recording equipment, which is considered as...
In Speaker Recognition (SR) system, feature extraction is one of the crucial steps where the particular speaker related information is extracted. The state of the art algorithm for this purpose is Mel Frequency Cepstral Coefficient (MFCC), and its complementary feature, Inverted Mel Frequency Cepstral Coefficient (IMFCC). MFCC is based on mel scale and IMFCC is based on inverted mel (imel) scale...
In Speaker Recognition (SR) system, feature extraction is one of the crucial steps where the particular speaker related information is extracted. The state of the art algorithm for this purpose is Mel Frequency Cepstral Coefficient (MFCC), and its complementary feature, Inverted Mel Frequency Cepstral Coefficient (IMFCC). MFCC is based on mel scale and IMFCC is based on inverted mel (imel) scale....
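The mel scale on which MFCC is based can be illustrated with a short sketch. It assumes the widely used mapping mel = 2595 · log10(1 + f / 700), which is one of several variants in use and is not given in the abstract. The inverted mel scale used by IMFCC is deliberately not shown, because its definition does not appear here. The function names are hypothetical.

```python
import math

def hz_to_mel(freq_hz):
    """Convert frequency in Hz to mels (assumed 2595 * log10(1 + f/700) variant)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_spaced_frequencies(low_hz, high_hz, count):
    """Return count frequencies in Hz that are equally spaced on the mel scale."""
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (count - 1)
    return [mel_to_hz(low_mel + i * step) for i in range(count)]

# Centre frequencies such as a mel filter bank might use (values are illustrative).
centres = mel_spaced_frequencies(0.0, 4000.0, 10)
```

Points that are equally spaced in mels sit closer together at low frequencies than at high ones, so a filter bank built on them resolves the low end of the spectrum more finely.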