In this paper we propose a Round Trip Translation (RTT) based approach to sentence-level confidence estimation (CE) for spoken language translation without the assistance of human-generated reference translations. A number of novel RTT-based features are introduced to reflect the quality of spoken language translation in more detail. After combining various kinds of features together, support vector...
Automatic prosodic break detection is important for both speech understanding and natural speech synthesis. In this paper, we develop a complementary model to detect Mandarin prosodic breaks using acoustic, lexical and syntactic evidence. The model achieves complementarity by exploiting the strengths of each component model. Compared with the baseline system, our proposed method performs well.
We consider the problem of similar Chinese character recognition in this paper. Using the Average Symmetric Uncertainty (ASU) criterion to measure the correlation between different image regions and the class label, we detect the most critical regions for each pair of similar characters. These critical regions are shown to contain more discriminative information and hence can largely...
This paper presents experiments using several vector space models in Automated Essay Scoring (AES). First, we compare four Vector Space Models (VSMs): the Word-based Vector Space Model (W-VSM), the Weight Adapted Word-based Vector Space Model (WAW-VSM), the Latent Semantic-based Vector Space Model (LS-VSM) and the Sequence Latent Semantic-based Vector Space Model (SLS-VSM). The...
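As a rough illustration of the word-based vector space idea mentioned in this abstract, the sketch below represents two texts as raw term-frequency vectors and compares them by cosine similarity. The abstract does not give the paper's weighting scheme or scoring procedure, so the function names `tf_vector` and `cosine`, the whitespace tokenization and the unweighted counts are all assumptions made for illustration only.

```python
import math
from collections import Counter


def tf_vector(text):
    """Hypothetical helper: lowercase, split on whitespace, count terms."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    essay = tf_vector("the cat sat")
    reference = tf_vector("the cat ran")
    print(cosine(essay, reference))
```

In a scoring setting, one plausible use would be to compare a new essay against pre-scored reference essays and take the score of the most similar ones, but that step is not described in the abstract.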
This paper addresses the ongoing issue of tone error detection for Mandarin Computer Assisted Language Learning (CALL) systems. A novel approach based on clustering is proposed. The selection of different contextual tonal factors, including Uni-tone, LBi-tone and RBi-tone, is explored. Experimental results show that our proposed approach is feasible, obtaining an Equal Error Rate (EER) of 18.75% by...
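For readers unfamiliar with the Equal Error Rate reported in this abstract, the following sketch shows one common way to estimate it from detector scores: sweep a decision threshold and take the point where the false acceptance rate and false rejection rate are closest. It is not the paper's evaluation code; the function name `equal_error_rate`, the score convention (higher means "tone error present") and the toy data are assumptions.

```python
def equal_error_rate(scores, labels):
    """Estimate the EER by sweeping thresholds over the observed scores.

    scores: detector outputs, higher = more likely positive (assumed).
    labels: 1 for true positives (e.g. a real tone error), 0 otherwise.
    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    best = None
    for t in sorted(set(scores)):
        far = sum(s >= t for s in neg) / len(neg)  # negatives accepted
        frr = sum(s < t for s in pos) / len(pos)   # positives rejected
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of FAR and FRR at the closest crossing.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
    toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]
    print(equal_error_rate(toy_scores, toy_labels))
```

With finite data the two error rates rarely cross exactly, so the midpoint at the closest threshold is used here as an approximation.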
Automatic stress detection is important for both speech understanding and natural speech synthesis. In this paper, we develop a hierarchical model based on boosting classification and regression trees (CART) to detect Mandarin stress using acoustic evidence and text information. Compared with the previously proposed method on the same training and test sets, there are 2.52% and 1.09% absolute accuracy...
In this paper, we present work in progress on automatic detection of stress in continuous Mandarin (standard Chinese) spoken utterances, and we are interested in finding the characteristics and performance of the acoustic stress cues in Mandarin. Therefore, correlated stress features including pitch, duration, intensity and spectral intensity are exploited with the purpose of developing the baseline...
In this study, we combine Mandarin-specific characteristics with acoustic attributes and text information, and use a hierarchical model based on ensemble machine learning to predict Mandarin pitch accent. Our model makes the most of both the hierarchical structure of prosody and ensemble machine learning. When comparing our model with classification and regression tree (CART), support vector machine...
Automatic assessment of word stress errors is an integral part of an oral language grading system. However, two problems are yet to be solved: the properties of a vowel depend on its context, and the data for different vowel classes are sparse. This paper briefly introduces a hybrid method consisting of both traditional prosodic features and the proposed context-dependent strategies. In classification...
Prosody is an important factor for a high-quality text-to-speech (TTS) system. Prosody is often described with a hierarchical structure. The generation of the hierarchical prosody structure is therefore very important in both corpus building and real-time text analysis, but the prosody labeling procedure is laborious and time-consuming. In this paper, an automatic prosody boundary labeling system is...
Many features have been proposed to evaluate examinees' language proficiency. However, few of them are semantics-based. In this paper, a novel feature for semantic scoring is presented. It is designed for a typical question type in language tests, namely reading-answering-problem. The proposed feature extraction process involves several operations: transcribing the speech data, automatically tagging...
Recently a new language model, the random forest language model (RFLM), has been proposed and has shown encouraging results in speech recognition tasks. In this paper we apply the RFLM to language identification tasks. We propose shared backoff smoothing to deal with the data sparseness problem. Experiments were conducted on a subset of NIST 2003 language recognition evaluation data. The RFLM obtained...
In this paper, we present a novel keyword spotting (KWS) method derived from traditional acoustic KWS. The advantage of this method is that it does not need any manually transcribed data to train the acoustic model, so it can be deployed quickly for KWS tasks involving small languages and dialectal speech, which traditional KWS systems cannot handle because of the lack of training data. A prototype...
We present a statistical machine translation system which uses hierarchical chunking phrases (HCPB). The system can be seen as a combination of fundamental ideas from both syntax-based translation and phrase-based translation, because the model not only complies with a formal synchronous context-free grammar but also uses partial parsing knowledge learned with the conditional random fields (CRF) method. The decoder...