Functional transfer matrices consist of real functions with trainable parameters. In this work, functional transfer matrices are used to model functional connections in neural networks. Unlike the linear connections in conventional weight matrices, the functional connections can represent nonlinear relations between two neighbouring layers. Neural networks with the functional connections, which...
Recently, bottleneck features have been used successfully as effective representations in Speaker Recognition (SR) and Language Recognition (LR), but little work has focused on bottleneck features for Bird Species Verification (BSV). In SR, LR and BSV tasks, short-time spectral features alone may be insufficient, so more abstract and discriminative representations are needed as a complement to...
Symbolic reasoning is difficult for neural networks, and reasoning with variables is especially challenging. In this paper, a symbolic reasoning method based on deep neural networks is proposed, and this method is applied to axiom discovery. This method makes use of the concept of “symbolic manipulation”. Specifically, it relies on the learning ability of the deep neural networks...
In this paper, we propose Multi-state Activation Functions (MSAFs) for Deep Neural Networks (DNNs). These multi-state functions perform additional classification on top of the 2-state Logistic function. Discussion of the MSAFs reveals that these activation functions have the potential to alter the parameter distribution of DNN models, improve model performance and reduce model size. Meanwhile, an...
Finding an effective way to score Chinese written essays automatically remains challenging for researchers. Several methods have been proposed and developed, but they are limited to the character and word usage levels. As one of the scoring standards, however, the content or topic perspective is also an important and necessary indicator for assessing an essay. Therefore, in this paper, we propose a novel perspective...
Punctuation recovery is very important for automatic speech recognition (ASR). It greatly improves the readability of transcripts and the user experience, and it facilitates downstream natural language processing tasks. The text-information-based method is one of the basic solutions to punctuation recovery. For analyzing the features of these algorithms, improving them and using them to develop practical systems,...
Chinese text error detection and correction is widely applicable, but the methods so far are not robust enough for industrial use. In this paper, a new method is proposed based on a tri-gram-modeled Weighted Finite-State Transducer (WFST). By integrating a confusing-character table, beam search and A* search, we evaluate the performance on real test essays. Various experiments have been conducted to prove...
In this paper, we first review several feature extraction algorithms for robust speech recognition, e.g. Mel frequency cepstral coefficients (MFCC) [1], perceptual linear prediction (PLP) [2] and power-normalized cepstral coefficients (PNCC) [3]. A new feature extraction algorithm for noise robust speech recognition is proposed, in which medium-time processing works as noise suppression...
This paper presents experiments using several vector space models in Automated Essay Scoring (AES). First, we compare four Vector Space Models (VSMs): the Word-based Vector Space Model (W-VSM), the Weight Adapted Word-based Vector Space Model (WAW-VSM), the Latent Semantic-based Vector Space Model (LS-VSM) and the Sequence Latent Semantic-based Vector Space Model (SLS-VSM). The...
This paper addresses the ongoing issue of tone error detection for Mandarin Computer Assisted Language Learning (CALL) systems. A novel approach based on clustering is proposed. The selection of different contextual tonal factors, including Uni-tone, LBi-tone and RBi-tone, is explored. Experimental results show that our proposed approach is feasible, obtaining an Equal Error Rate (EER) of 18.75% by...
In this paper, we present work in progress on the automatic detection of stress in continuous Mandarin (standard Chinese) spoken utterances, with a focus on the characteristics and performance of the acoustic stress cues in Mandarin. To this end, correlated stress features including pitch, duration, intensity and spectral intensity are exploited with the purpose of developing the baseline...
Intonation assessment is an important part of Chinese CALL systems. Currently, most systems use correlation and RMSE features to assess the intonation quality of a given utterance. As correlation and RMSE assign unoptimized weights to different degrees of mismatching errors, they may lead to performance degradation. In this paper, we propose a new feature called sorted error vector (SEV) for...
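For readers unfamiliar with the two baseline features named above, the following is a minimal sketch of how RMSE and Pearson correlation between a reference pitch contour and a learner's contour might be computed. The contour values and function names are hypothetical and are not taken from the paper, and the SEV feature itself is not shown because the abstract is truncated. One plausible reading of "unoptimized weights" is that RMSE always weights errors quadratically, which the two test contours illustrate: both have the same total absolute error, yet the one with a single large mismatch scores worse.

```python
import math


def rmse(reference, test):
    """Root-mean-square error between two equal-length contours."""
    squared = [(r - t) ** 2 for r, t in zip(reference, test)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(reference, test):
    """Pearson correlation coefficient between two equal-length contours."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_t = sum(test) / n
    cov = sum((r - mean_r) * (t - mean_t) for r, t in zip(reference, test))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_t = sum((t - mean_t) ** 2 for t in test)
    return cov / math.sqrt(var_r * var_t)


# Hypothetical pitch contours in Hz (illustrative values only).
reference = [200.0, 210.0, 220.0, 230.0]
test_even = [201.0, 211.0, 221.0, 231.0]   # four errors of 1 Hz each
test_spike = [204.0, 210.0, 220.0, 230.0]  # one error of 4 Hz

rmse_even = rmse(reference, test_even)     # 1.0
rmse_spike = rmse(reference, test_spike)   # 2.0
corr_even = pearson(reference, test_even)
corr_spike = pearson(reference, test_spike)
```

Note that a constant offset, as in `test_even`, leaves the correlation at its maximum while still contributing to RMSE, which is one reason the two features are often used together.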
F0 is an important tone feature in state-of-the-art tone recognition systems. Traditionally, the difference of F0 (ΔF0), subsection slope and intercept, and subsection mean F0 and mean F0 are used to improve the recognition accuracy. In fact, all these features can be expressed as a linear transform of F0. The problem is to find the best coefficients for the transform. Linear discriminant analysis (LDA)...
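As a small illustration of the claim that such features are linear transforms of F0, the sketch below writes the frame-to-frame difference and the mean of an F0 contour as matrix-vector products. The contour values, matrix names and helper function are hypothetical and not from the paper, and the LDA step that learns the coefficients is not shown.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# Hypothetical F0 contour in Hz over five frames (illustrative values only).
f0 = [200.0, 210.0, 225.0, 220.0, 205.0]
n = len(f0)

# Difference matrix: row i has -1 at column i and +1 at column i + 1,
# so row i computes f0[i + 1] - f0[i].
diff_matrix = [
    [-1.0 if j == i else 1.0 if j == i + 1 else 0.0 for j in range(n)]
    for i in range(n - 1)
]

# Mean matrix: a single row of equal weights 1/n.
mean_matrix = [[1.0 / n] * n]

delta_f0 = matvec(diff_matrix, f0)  # [10.0, 15.0, -5.0, -15.0]
mean_f0 = matvec(mean_matrix, f0)   # one element, approximately 212.0
```

Hand-designed features fix the rows of these matrices in advance; a learned transform would instead choose the row coefficients from data.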
This paper presents an effective method for automatic pronunciation evaluation based on feature extraction and combination. The proposed system extracts different kinds of evaluation features and combines them to produce a final machine score, which predicts the overall pronunciation quality of a student. Experiments on a reading speech database show that most of the selected features...