In this paper, the authors describe parameters that may be tuned to obtain the best performance and accuracy in a large-vocabulary continuous speech recognition task. The behavior of certain parameters should be similar regardless of the language being recognized. However, some parameters will affect the accuracy of Polish speech recognition differently than that of English speech recognition.
Word n-gram statistics collected from over 1 300 000 000 words are presented. Even though they were collected from various good sources, they contain several types of errors. The paper focuses on the process of partly supervised correction of the n-grams. The types of errors are described, as well as our software, which allows fast and efficient corrections.
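As a rough illustration of what collecting word n-gram statistics and flagging candidates for correction can look like, here is a minimal Python sketch. It is not the authors' software: the function names `count_ngrams` and `flag_suspicious` are hypothetical, and the validity check (purely alphabetic tokens) is an assumed stand-in for whatever error types the paper actually handles.

```python
from collections import Counter


def count_ngrams(tokens, n):
    """Count every run of n consecutive tokens in the token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def flag_suspicious(counts, is_valid_word=str.isalpha):
    """Return n-grams that contain at least one token rejected by is_valid_word.

    The default check (alphabetic tokens only) is an assumption made for
    this sketch; a real corrector would use richer, language-specific rules.
    """
    return [gram for gram in counts if not all(is_valid_word(w) for w in gram)]


if __name__ == "__main__":
    tokens = ["to", "jest", "te5t", "to", "jest"]
    bigrams = count_ngrams(tokens, 2)
    print(bigrams.most_common(1))
    print(flag_suspicious(bigrams))
```

In a partly supervised workflow, the flagged n-grams would be shown to a human reviewer rather than corrected automatically.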
The use of language models in automatic speech recognition systems usually gives a significant improvement in the quality and certainty of recognition outcomes. On the other hand, wrongly chosen or trained language models can seriously degrade not only recognition quality but also the overall performance of the system. Proper selection of language material, system parameters and representation of the...
A speech recognition system for Polish based on HTK is presented. It was trained on 365 utterances, all spoken by 26 male speakers. The features of Polish relevant to speech recognition are described. Some aspects of speech recognition differ from English. Recognition errors were analysed in detail in an attempt to find the reasons for, and scenarios of, wrong recognitions.
Segmenting speech signals on the basis of time-frequency analysis is the most natural approach. Boundaries are located where the energy of some frequency subband changes rapidly. A speech segmentation method based on the discrete wavelet transform, the resulting power spectrum and its derivatives is presented. This information makes it possible to locate the boundaries of phonemes. A statistical classification...
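The idea of placing boundaries where subband energy changes rapidly can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method of the paper: it uses the Haar wavelet, fixed non-overlapping frames, and a relative frame-to-frame energy change as a crude stand-in for the derivative of the power spectrum. The names `haar_dwt`, `subband_energies` and `find_boundaries`, and the parameters `frame_len`, `levels` and `threshold`, are all hypothetical.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT; returns (approximation, detail) coefficients."""
    approx, detail = [], []
    for a, b in zip(signal[0::2], signal[1::2]):
        approx.append((a + b) / math.sqrt(2))
        detail.append((a - b) / math.sqrt(2))
    return approx, detail


def subband_energies(frame, levels):
    """Energy of each detail subband, followed by the final approximation band."""
    energies = []
    current = list(frame)
    for _ in range(levels):
        current, detail = haar_dwt(current)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(c * c for c in current))
    return energies


def find_boundaries(signal, frame_len=64, levels=3, threshold=0.5):
    """Return start indices of frames whose subband energies changed sharply.

    The change measure |e - p| / (e + p) lies in [0, 1]; the small constant
    keeps near-silent subbands from producing spurious boundaries.
    """
    boundaries = []
    previous = None
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        energies = subband_energies(signal[start:start + frame_len], levels)
        if previous is not None:
            change = max(
                abs(e - p) / (e + p + 1e-12) for e, p in zip(energies, previous)
            )
            if change > threshold:
                boundaries.append(start)
        previous = energies
    return boundaries


if __name__ == "__main__":
    # Synthetic "two-phoneme" signal: a slow sine followed by a fast sine.
    low = [math.sin(2 * math.pi * n / 32) for n in range(256)]
    high = [math.sin(2 * math.pi * n / 4) for n in range(256, 512)]
    print(find_boundaries(low + high))
```

On real speech, overlapping frames, smoothing of the energy envelopes and a tuned threshold would be needed; the synthetic signal here only shows that a jump in subband energy is detected at the frame where the frequency content changes.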