Proxy-word based out-of-vocabulary (OOV) keyword search has proven quite effective. In this approach, each OOV keyword is assigned several proxies, and detections of the proxies are regarded as detections of the OOV keyword. However, the confidence scores of these detections are still those of the proxies from lattices. To obtain a better confidence...
Computational auditory scene analysis (CASA) systems have been widely used for speech enhancement in recent years. We propose a new system that combines CASA and spectral subtraction to obtain better enhanced speech. The CASA part is built on deep neural networks (DNNs). The original way to reconstruct the denoised signal is to use the estimated masks with the direct overlap-add method, ignoring...
The National Digital Switching System Engineering and Technological R&D Center (NDSC) speech-to-text transcription system for the 2016 multi-genre broadcast challenge is described. Various acoustic models based on deep neural networks (DNNs), such as the hybrid DNN, the long short-term memory recurrent neural network (LSTM RNN), and the time-delay neural network (TDNN), are trained. The system also makes use...
Speech separation based on deep neural networks (DNNs) has been widely studied recently, and has achieved considerable success. However, previous studies are mostly based on fully-connected neural networks. In order to capture the local information of speech signals, we propose to use convolutional maxout neural networks (CMNNs) to separate speech and noise by estimating the ideal ratio mask of the...
The Context-Dependent Deep-Neural-Network HMM, or CD-DNN-HMM, is a powerful acoustic modeling technique. Its training process typically involves unsupervised pre-training and supervised fine-tuning. In this paper, we demonstrate that the performance of DNNs can be improved by utilizing a large amount of unlabeled data in the training procedure. In our method, CD-DNN-HMM trained using 309 hours of unlabeled...