Recently, a hybrid deep neural network (DNN)/i-vector framework has proven effective for speaker verification, where a DNN trained to predict tied-triphone states (senones) is used to produce frame alignments for sufficient statistics extraction. In this work, in order to better understand the impact of phonetic precision on speaker verification tasks, three levels of phonetic granularity...
Proxy-word based out-of-vocabulary (OOV) keyword search has proven quite effective. In proxy-word based OOV keyword search, each OOV keyword is assigned several proxies, and detections of the proxies are regarded as detections of the OOV keyword. However, the confidence scores of these detections are still those of the proxies from lattices. To obtain a better confidence...
The National Digital Switching System Engineering and Technological R&D Center (NDSC) speech-to-text transcription system for the 2016 multi-genre broadcast challenge is described. Various acoustic models based on deep neural networks (DNNs), such as hybrid DNN, long short-term memory recurrent neural network (LSTM RNN), and time delay neural network (TDNN) models, are trained. The system also makes use...
End-to-end speech recognition systems have been successfully implemented and have become competitive replacements for hybrid systems. A common loss function for training end-to-end systems is connectionist temporal classification (CTC). This method maximizes the log likelihood of the associated transcription sequence given the feature sequence. However, there are some weaknesses with CTC training. The...
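As an illustration of the CTC objective mentioned in this abstract, the following is a minimal sketch of the standard CTC forward recursion, which sums over all frame-level alignments that collapse to a given label sequence. It is not the authors' implementation. The function names `ctc_log_likelihood` and `logsumexp`, the choice of index 0 for the blank symbol, and the toy uniform posteriors are assumptions made for this example.

```python
import math

NEG_INF = float("-inf")


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) over the given log-values."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_log_likelihood(log_probs, labels, blank=0):
    """Return log p(labels | input) under CTC.

    log_probs[t][k] is the log-posterior of symbol k at frame t
    (hypothetical network output); labels is the target sequence
    without blanks.
    """
    # Extended sequence: a blank before, between, and after the labels.
    ext = [blank]
    for label in labels:
        ext += [label, blank]
    S, T = len(ext), len(log_probs)

    # alpha[s]: log-probability of all partial alignments ending in ext[s].
    alpha = [NEG_INF] * S
    alpha[0] = log_probs[0][ext[0]]
    if S > 1:
        alpha[1] = log_probs[0][ext[1]]

    for t in range(1, T):
        new = [NEG_INF] * S
        for s in range(S):
            terms = [alpha[s]]                 # stay on the same symbol
            if s >= 1:
                terms.append(alpha[s - 1])     # advance by one position
            # Skipping a blank is allowed only between different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                terms.append(alpha[s - 2])
            new[s] = logsumexp(*terms) + log_probs[t][ext[s]]
        alpha = new

    # Valid alignments end on the last label or on the trailing blank.
    if S > 1:
        return logsumexp(alpha[-1], alpha[-2])
    return alpha[-1]


# Toy input: 2 frames, symbols {0: blank, 1: "a"}, uniform posteriors.
toy = [[math.log(0.5), math.log(0.5)] for _ in range(2)]
p_a = math.exp(ctc_log_likelihood(toy, [1]))
p_empty = math.exp(ctc_log_likelihood(toy, []))
```

In this toy case, three of the four possible two-frame paths collapse to the single label "a", so `p_a` is 0.75 and `p_empty` is 0.25. Training with CTC maximizes this log likelihood (equivalently, minimizes its negative) with respect to the network parameters.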
Recurrent neural networks (RNNs) have shown an ability to model temporal dependencies. However, the problem of exploding or vanishing gradients has limited their application. In recent years, long short-term memory RNNs (LSTM RNNs) have been proposed to solve this problem and have achieved excellent results. However, because of their large size, LSTM RNNs suffer from overfitting more easily,...
The OpenKWS14 keyword search evaluation is one of the most challenging and influential evaluations in the field of speech recognition. Its goal is to build a high-performance keyword search system for a minority language with limited training data in a short period of time. We present the system of the Department of Electronic Engineering, Tsinghua University (THUEE team) for the OpenKWS14 keyword...
In the field of prosody event detection, many local acoustic features have been proposed for representing the prosodic characteristics of speech units. The context information that represents possible regularities underlying neighboring prosody events, however, has not been used effectively. The main difficulty in utilizing prosodic context is that it is hard to capture the long-distance sequential dependency...
This paper takes the design of a small intelligent humanoid robot as its foundation and describes how to realize the robot's basic functions, human-computer interaction, and intelligent functions. The work is presented from four aspects: three single intelligent functions (speech, vision, and proximity sensing), a multi-sensor fusion system, gait planning, and physical prototype testing...
The traditional Finite Element Method (FEM) and Statistical Energy Analysis (SEA) are widely used in predicting the structural noise of vehicles. However, both methods have shortcomings in solving mid-frequency problems. In this work, a hybrid FE-SEA vehicle model is built to analyze the structural-acoustic system over the frequency range 200–500 Hz. Experiments are carried out to acquire...
The reflection of ultrasonic waves from a solid-solid interface can be used to measure the thickness of the layer between two solid surfaces. This paper describes a theoretical analysis and experimental study of lubricant film thickness measurement using normal-incidence ultrasonic reflection. The proportion of the incident wave which is reflected at the interface was measured as...
Bonding is an essential step for enclosing microchannels or microchambers in lab-on-a-chip devices. Ultrasonic bonding was studied as a deformation-free technique to realize high-efficiency bonding of microfluidic chips. Based on viscoelastic dissipation theory, the main factors influencing the heat generation rate during ultrasonic bonding were theoretically analyzed and numerically calculated using finite element...