Neural machine translation (NMT) has shown promising results and rapidly gained adoption in many large-scale settings. As NMT models become widely used in production, their long-standing weakness in handling rare and out-of-vocabulary words has been amplified. To relieve the model of the burden of “understanding” rare words, the copy mechanism has been proposed to...
In this paper, we aim to improve the accuracy of acoustic modelling for statistical parametric speech synthesis (SPSS) and introduce the convolutional neural network (CNN) for its strong capacity for modelling locality. A novel model architecture combining unidirectional long short-term memory (LSTM) and a time-domain convolutional output layer (COL) is proposed and applied to acoustic modelling...
In automatic speech recognition (ASR), connectionist temporal classification (CTC) is regarded as a method for building an end-to-end system. In fact, not only characters (Chars) but also context-independent phonemes (CI-Phns) or context-dependent phonemes (CD-Phns) can be used as output units of a CTC-trained neural network. The contribution of this paper mainly lies in three aspects: First, we trained...
Word segmentation is the first step in Chinese natural language processing, and errors caused by word segmentation can propagate through the whole system. In order to reduce the impact of word segmentation and improve the overall performance of a Chinese short text classification system, we propose a hybrid model of character-level and word-level features based on a recurrent neural network (RNN) with...
In this paper, we propose a convolutional framework for short text expansion and classification. In particular, by using additive composition over word embeddings from contexts with variable window widths, the representations of multi-scale semantic units are computed first. Empirically, semantically related words are usually close to each other in embedding spaces. Thus, the restricted nearest...
There are several papers about pseudo-dynamic methods used in signature authentication. Recently, the gray-scale local binary pattern (LBP) feature, which originates from texture analysis, has been widely used in signature verification systems because of its robustness to illumination changes. The major problem of LBP is its sensitivity to noise, and many solutions have been applied to address this problem...
In this paper, we propose an unsupervised phrase-based data selection model that addresses the problem of selecting non-domain-specific language model (LM) training data to build an adapted LM. In a spoken language translation (SLT) system, we aim to find the LM training sentences that are similar to the translation task. Compared with the traditional bag-of-words models, the phrase-based data selection...
Hierarchical phrase-based (HPB) translation has been introduced to speech-to-speech (S2S) translation systems on mobile terminals, such as smartphones. However, it suffers from explosive growth in the number of rules and a corresponding increase in decoding time, which is a problem for S2S translation systems where memory and decoding speed are restricted. In this paper, we propose a nesting HPB model to capture the...
In this paper, we present a platform for aiding in the development and evaluation of novel ITS passive safety applications. Such applications work by having vehicles detect certain events that may be dangerous to other vehicles and disseminating reports about these events using wireless communication. A vehicle receiving the report about the event can then be warned. However, a large number of false...
In this paper, we propose a discriminative method for the acoustic feature based language recognizer, which is a modification of the polynomial expansion in generalized linear discriminant sequence (GLDS) kernel. It is inspired by the Gaussian mixture model-support vector machine (GMM-SVM) system which has been successfully used in both speaker and language recognition. Because of the restriction...
This paper presents our preliminary work on unsupervised training of subspace Gaussian mixture models for an under-resourced CTS recognition task. The subspace model yields better performance than the conventional GMM model, particularly with small or medium-sized training sets. As an effective way to save human effort, unsupervised learning is often applied to automatically transcribe a large amount...
In the task of mispronunciation detection, cross-speaker degradation and other confounding nuisance factors are challenging problems that demand a solution. In this paper, we attempt to remove the non-pronunciation variations in the GLDS-SVM expansion space using a nuisance attribute projection strategy, in order to increase the separability of different phoneme instances...
In this paper we propose a Round Trip Translation (RTT) based approach to sentence-level confidence estimation (CE) for spoken language translation without the assistance of human-generated reference translations. A number of novel RTT-based features are introduced to reflect the quality of spoken language translation in more detail. After combining various kinds of features together, support vector...
We consider the problem of similar Chinese character recognition in this paper. Using the Average Symmetric Uncertainty (ASU) criterion to measure the correlation between different image regions and the class label, we detect the most critical regions for each pair of similar characters. These critical regions are shown to contain more discriminative information and hence can largely...
This paper outlines the first Asian network-based speech-to-speech translation system developed by the Asian Speech Translation Advanced Research (A-STAR) consortium. The system was designed to translate common spoken utterances of travel conversations from a certain source language into multiple target languages in order to facilitate multiparty travel conversations between people speaking different...
Experimental teaching is widely applied in the fields of natural science, engineering, ecology and social science. It is also increasingly penetrating business-related teaching and training. In particular, simulation-based experiments have received considerable attention in many university management and economics schools. Simulation-based experimental teaching helps the students...
Joint factor analysis (JFA) has become the state-of-the-art technique for speaker verification. At the same time, training the eigenvoice matrix is a heavy burden, because it requires large amounts of multi-channel data, which largely determine the performance of the system. In this paper, we first try to explore an upper bound on the performance of the JFA system in a non-normal...
In this study, we combine Mandarin characteristics with Mandarin acoustic attributes and text information, and use hierarchical-model-based ensemble machine learning to predict Mandarin pitch accent. Our model takes full advantage of the hierarchical structure of prosody and of ensemble machine learning. When comparing our model with classification and regression tree (CART), support vector machine...
This paper investigates the combined feedforward and decision-feedback equalizer (DFE) in optical burst-mode receivers. The main focus is on the challenge of using DFE to improve bit-error rate (BER) performance for burst-mode receivers.
Automatic assessment of word stress errors is an integral part of an oral language grading system. However, two problems are yet to be solved: the properties of a vowel depend on its context, and data are sparse for different vowel classes. This paper briefly introduces a hybrid method consisting of both traditional prosodic features and proposed context-dependent strategies. In classification...