A multi-pitch determination algorithm based on the mixture Laplacian distribution (MLD) is proposed. The MLD replaces the autocorrelation function (ACF) of the correlogram, which shows how likely a lag is to be the pitch period. The peaks of the summary MLDs indicate the multiple pitch periods. Compared with the summary correlogram, the summary MLDs have better resolution and fewer pseudo peaks which do not correspond...
In this paper we propose a Round Trip Translation (RTT) based approach to sentence-level confidence estimation (CE) for spoken language translation without the assistance of human-generated reference translations. A number of novel RTT-based features are introduced to reflect the quality of spoken language translation in more detail. After combining various kinds of features, support vector...
Automatic stress detection is important for both speech understanding and natural speech synthesis. In this paper, we develop a hierarchical-model-based boosting classification and regression tree (CART) to detect Mandarin stress using acoustic evidence and text information. Compared with the previously proposed method on the same training and test sets, there are 2.52% and 1.09% absolute accuracy...
Joint factor analysis (JFA) has become the state-of-the-art technique for speaker verification. At the same time, training the eigenvoice matrix is a heavy burden, because it requires large amounts of multi-channel data, which largely determine the performance of the system. In this paper, we first try to exploit an upper bound on the performance of the JFA system in a non-normal...
In this paper, we present work in progress on the automatic detection of stress in continuous Mandarin (standard Chinese) spoken utterances. We are interested in the characteristics and performance of the acoustic stress cues in Mandarin. Correlated stress features, including pitch, duration, intensity and spectral intensity, are therefore exploited with the purpose of developing the baseline...
Intonation assessment is an important part of a Chinese CALL system. Nowadays, most systems use correlation and RMSE features to assess the quality of the intonation of a given utterance. As correlation and RMSE assign unoptimized weights to different degrees of mismatching errors, they may lead to performance degradation. In this paper, we propose a new feature called sorted error vector (SEV) for...
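The two baseline features named in this abstract, correlation and RMSE, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how pitch contours are represented or aligned, so the code assumes two already time-aligned, equal-length lists of F0 values in Hz, and the function names `pearson_correlation` and `rmse` are hypothetical. It does not implement the proposed SEV feature, which the truncated abstract does not define.

```python
import math


def pearson_correlation(reference, learner):
    """Pearson correlation between two equal-length pitch contours."""
    if len(reference) != len(learner) or len(reference) < 2:
        raise ValueError("contours must have equal length of at least 2")
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_lrn = sum(learner) / n
    # Covariance and variance sums (the 1/n factors cancel in the ratio).
    cov = sum((r - mean_ref) * (l - mean_lrn) for r, l in zip(reference, learner))
    var_ref = sum((r - mean_ref) ** 2 for r in reference)
    var_lrn = sum((l - mean_lrn) ** 2 for l in learner)
    if var_ref == 0 or var_lrn == 0:
        raise ValueError("correlation is undefined for a constant contour")
    return cov / math.sqrt(var_ref * var_lrn)


def rmse(reference, learner):
    """Root-mean-square error between two equal-length pitch contours."""
    if len(reference) != len(learner) or not reference:
        raise ValueError("contours must be non-empty and of equal length")
    squared = sum((r - l) ** 2 for r, l in zip(reference, learner))
    return math.sqrt(squared / len(reference))


# Assumed toy data: a reference contour and a learner contour that has
# the same shape but is shifted up by a constant 10 Hz.
reference_f0 = [200.0, 220.0, 240.0, 230.0]
learner_f0 = [f0 + 10.0 for f0 in reference_f0]

corr = pearson_correlation(reference_f0, learner_f0)
err = rmse(reference_f0, learner_f0)
```

For the toy contours above, a constant 10 Hz offset leaves the correlation at 1 while the RMSE is 10 Hz, which shows that the two features respond to different kinds of mismatch.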
Mispronunciation detection is one of the vital tasks of CALL (Computer Assisted Language Learning) systems. Many methods have been introduced to accomplish this task. However, few of them have addressed the detection task on confusable phones. In this paper, phone-level classifiers are used to improve the detection performance on confusable phones. Features of the classifiers are posterior...
Monaural speech segregation is a very challenging problem which has been studied by many researchers. In this paper, we focus on voiced speech segregation. Different strategies are used to segregate resolved and unresolved harmonics. For resolved harmonics, the "harmonicity" principle and a novel mechanism based on the "minimum amplitude" principle are employed. Amplitude...
This paper presents an effective method for automatic pronunciation evaluation, which is based on feature extraction and combination. The proposed system extracts different kinds of evaluation features and combines them to produce an ultimate machine score, which predicts the overall pronunciation quality of a student. Experiments on a reading speech database show that most of the selected features...
In this paper, a harmonics template is proposed to analyze harmonics in co-channel speech. A new feature, the first-peak position of the autocorrelation, is exploited in generating and using the harmonics template. We obtain the harmonics template by statistics and curve fitting. For co-channel speech, harmonics from different channels are obtained by applying frequency channel piecewise continuity and matching...
Monaural speech separation is one of the most difficult problems in speech signal processing. In this paper, a new method based on machine learning and computational auditory scene analysis (CASA) is suggested to separate monaural two-talker speech. Machine learning is used to learn the grouping cues from isolated clean data from a single speaker. By using a factorial-max vector...
The task of keyword spotting is to detect a set of keywords in continuous input speech. The main goal of this work is to develop an improved Mandarin keyword spotting (KWS) system for conversational telephone speech (CTS). In this paper, we propose an efficient online-garbage-model-based KWS system, which is integrated with a word-level minimum classification error (MCE) training method and a novel...
Monaural speech separation is a very challenging problem in speech signal processing. It has been studied previously, and many separation systems based on computational auditory scene analysis (CASA) have been proposed. Although research on CASA has tended to move from primitive data-driven methods toward introducing high-level knowledge into the separation process, the knowledge of speech quality still has not...
Monaural speech separation is a very challenging problem in speech signal processing. It has been studied extensively, and many separation systems based on computational auditory scene analysis (CASA) have been proposed in the last two decades. Although the research on CASA has tended to introduce high-level knowledge into separation processes using primitive data-driven methods, the knowledge on...