In this paper, we propose a convolutional framework for short text expansion and classification. In particular, the representations of multi-scale semantic units are first computed by additive composition over word embeddings within context windows of variable width. Empirically, semantically related words are usually close to each other in embedding spaces. Thus, the restricted nearest...
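As a rough illustration of the additive composition step mentioned in the abstract above, the following minimal sketch sums word vectors over every contiguous window up to a chosen width. The function name `additive_units`, the toy two-dimensional vectors, and the choice to enumerate all windows are assumptions made for illustration; the abstract does not specify the authors' implementation.

```python
def additive_units(embeddings, max_width):
    """Sum word vectors over every contiguous window of width 1..max_width.

    embeddings: list of equal-length lists of floats, one per word (toy data here).
    Returns a dict mapping (start_index, width) to the summed vector.
    """
    dim = len(embeddings[0])
    units = {}
    for width in range(1, max_width + 1):
        # Slide a window of the current width across the text.
        for start in range(len(embeddings) - width + 1):
            window = embeddings[start:start + width]
            units[(start, width)] = [sum(vec[d] for vec in window) for d in range(dim)]
    return units


# Hypothetical three-word text with two-dimensional embeddings.
toy_text = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_units = additive_units(toy_text, max_width=2)
```

Here `toy_units[(0, 2)]` holds the representation of the two-word unit starting at the first word, and width-1 entries are the original word vectors.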
There are several papers about pseudo-dynamic methods used in signature authentication. Recently, the local binary pattern (LBP), a gray-scale feature originating from texture analysis, has been widely used in signature verification systems because of its robustness to illumination changes. The major problem of LBP is its sensitivity to noise; hence, many solutions have been applied to solve this problem...
Automatic TV commercial block detection is a key component of an intelligent commercial management system. Rather than exclusively utilizing audio-visual characteristics, as most previous works do, we propose an SVM-DP scheme to collaboratively exploit audio-visual and global temporal characteristics associated with commercials. Firstly, likelihood values of commercial and general program are calculated...
In this paper, we first review several feature extraction algorithms for robust speech recognition, e.g. Mel frequency cepstral coefficients (MFCC) [1], perceptual linear prediction (PLP) [2] and power-normalized cepstral coefficients (PNCC) [3]. A new feature extraction algorithm for noise-robust speech recognition is proposed, in which medium-time processing works as noise suppression...
Efficient and robust retrieval of commercial videos is an important topic for many applications, such as commercial monitoring and market investigation. In this paper, we propose a two-step scheme to optimally incorporate the information of both visual and audio modalities into commercial retrieval. Firstly, an efficient search method based on the extracted audio fingerprinting feature is proposed to yield...
Automatic scene detection is a fundamental step for efficient video searching and browsing. This paper presents our current work on scene detection, which integrates three effective strategies into a single framework. For each video, firstly, a coherence signal is constructed from a graph model obtained from the similarity matrix in a temporal interval. Secondly, the signal is optimized by scene transition...
With the fast development of high-speed networks and digital video recording technologies, broadcast video has been playing an increasingly important role in our daily life. In this paper, we propose a novel news story segmentation scheme which can segment broadcast video into story units using a multi-modal information fusion (MMIF) strategy. Compared with traditional methods, the proposed scheme extracts...
In this paper, we propose a Round Trip Translation (RTT) based approach to sentence-level confidence estimation (CE) for spoken language translation without the assistance of human-generated reference translations. A number of novel RTT-based features are introduced to reflect the quality of spoken language translation in more detail. After combining various kinds of features together, support vector...
We consider the problem of similar Chinese character recognition in this paper. Using the Average Symmetric Uncertainty (ASU) criterion to measure the correlation between different image regions and the class label, we detect the most critical regions for each pair of similar characters. These critical regions are shown to contain more discriminative information and hence can largely...
This paper presents experiments using several vector space models in Automated Essay Scoring (AES). Firstly, we compare four Vector Space Models (VSMs): the Word-based Vector Space Model (W-VSM), the Weight Adapted Word-based Vector Space Model (WAW-VSM), the Latent Semantic-based Vector Space Model (LS-VSM) and the Sequence Latent Semantic-based Vector Space Model (SLS-VSM). The...
This paper addresses the ongoing issue of tone error detection for Mandarin Computer Assisted Language Learning (CALL) systems. A novel approach based on clustering is proposed. The selection of different contextual tonal factors, including Uni-tone, LBi-tone and RBi-tone, is explored. Experimental results show that our proposed approach is feasible, obtaining an Equal Error Rate (EER) of 18.75% by...
In this paper, a novel image forensics method is proposed to detect manually blurred edges in a tampered image. Firstly, the image edges are analyzed using the non-subsampled contourlet transform. Then the differences between normal edges and blurred edges are extracted by examining phase congruency and the prediction-error image. After that, the features are used to train the SVM, by which the...
In this paper, we present work in progress on the automatic detection of stress in continuous Mandarin (standard Chinese) spoken utterances, and we are interested in finding the characteristics and performance of the acoustic stress cues in Mandarin. Therefore, correlated stress features including pitch, duration, intensity and spectral intensity are exploited with the purpose of developing the baseline...
This paper focuses on setting up a question-answering system oriented to the biomedical domain, and it applies several different approaches to the different processing phases. Firstly, it uses a shallow parser to identify the types of questions and extract the keywords, and the keywords are expanded with UMLS to improve recall. Secondly, passage retrieval is performed with the expanded keywords...
Automatic assessment of word stress errors is an integral part of an oral language grading system. However, two problems are yet to be solved: the properties of a vowel depend on its context, and data are sparse for some vowel classes. This paper briefly introduces a hybrid method consisting of both traditional prosodic features and proposed context-dependent strategies. In classification...
Intonation assessment is an important part of a Chinese CALL system. Nowadays, most systems use correlation and RMSE features to assess the quality of the intonation of a given utterance. As correlation and RMSE assign unoptimized weights to different degrees of mismatching errors, they may lead to performance degradation. In this paper, we propose a new feature called sorted error vector (SEV) for...
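The two baseline features named in the abstract above, correlation and RMSE, can be sketched as follows for a pair of pitch contours. This is only an illustration of the baseline measures, not of the proposed SEV feature, which the truncated abstract does not define. The function names and the toy contours are hypothetical, and the sketch assumes the contours are already time-aligned and of equal length.

```python
import math


def rmse(reference, learner):
    """Root-mean-square error between two equal-length pitch contours."""
    squared = [(r - s) ** 2 for r, s in zip(reference, learner)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_correlation(reference, learner):
    """Pearson correlation coefficient between two equal-length contours."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_s = sum(learner) / n
    cov = sum((r - mean_r) * (s - mean_s) for r, s in zip(reference, learner))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_s = sum((s - mean_s) ** 2 for s in learner)
    return cov / math.sqrt(var_r * var_s)


# Hypothetical contours: the learner's contour is the reference shifted by a constant.
reference_contour = [1.0, 2.0, 3.0, 4.0]
learner_contour = [2.0, 3.0, 4.0, 5.0]
contour_rmse = rmse(reference_contour, learner_contour)
contour_corr = pearson_correlation(reference_contour, learner_contour)
```

With these toy values the contour shapes match exactly, so the correlation is 1 while the RMSE is 1, which shows that the two measures respond to different kinds of mismatch.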
Mispronunciation detection is an important component of computer assisted language learning (CALL) systems. In this work, we introduce an efficient GLDS-SVM based detection method, which has been used successfully in language and speaker identification systems, and combine it with traditional methods. The main ideas include: extended MFCC features with normalized formant trajectory information, and then...
Mispronunciation detection is one of the vital tasks of CALL (Computer Assisted Language Learning) systems. Many methods have been introduced to accomplish this task. However, few of them have addressed the detection task on confusable phones. In this paper, phone-level classifiers are utilized to improve the detection performance on the confusable phones. Features of the classifiers are posterior...
Although researchers have made great progress on music genre classification in recent years, the need for more accurate systems is still not satisfied. In this paper, we propose a method for further reducing the classification error rate based on multiple classifier fusion. First of all, MFCCs and four features from the MPEG-7 audio descriptors are extracted in every short time frame, and then a group...
Many features have been proposed to evaluate examinees' language proficiency. However, few of them are semantics-based. In this paper, a novel feature for semantic scoring is presented. It is designed for a typical question type in language tests, namely reading-answering-problem. The proposed feature extraction process involves several operations: transcribing the speech data, automatically tagging...