The performance of several training techniques for automatic speech recognition systems is compared in this paper. Speech recognition accuracy was used as the performance measure. Different kinds of outdoor and indoor noise were used in the study. Training on noisy speech is shown to be superior to the competing technique of training on clean speech. It has been found that training by...
Development of automatic speech recognition (ASR) systems that are robust to late reverberation is an urgent task. It is well known that a late reverberation reduction algorithm used as an ASR pre-processor requires a prior estimate of the reverberation time. Blind reverberation time estimates are less accurate than those obtained from direct measurements of a known room impulse response (RIR). As a result, it is naturally expect...
In this paper, the results of quality and intelligibility assessment of speech masked by stationary and nonstationary noise are presented. A subjective speech quality assessment technique was used to show that the masking ability of white noise is lower than that of pink and even brown noise when the SNR is below 0 dB. Two algorithms for generating nonstationary noise have been proposed. They are...
In this paper, subjective and objective estimators of the quality of speech and music signals subjected to phase distortion are compared, and a mapping between objective and subjective quality estimates is established. It was found that phase distortion of speech signals is perceived more strongly than that of musical signals. Two types of phase distortion are considered: 1) low-frequency signal components...
In this paper, two techniques for training an automatic speech recognition system on noisy speech are compared with training on clean speech. The comparison was made using the speech recognition accuracy measure and fourteen kinds of noise: noises of household appliances and computers, street and transport, classrooms and lobbies. The degree of superiority of...
Noise and late reverberation reduction algorithms were studied using objective speech quality measures and speech recognition accuracy (Acc%). Negative consequences of excessive noise and late reverberation reduction for automatic speech recognition were demonstrated. A study of speech quality measures showed that only a few of them were in good agreement with Acc%. It was shown that Acc% may be in...
The effect of the “decision-directed”, maximum likelihood, and “rough” a priori signal-to-noise ratio (SNR) estimation methods and their parameters on the quality of noise reduction algorithms was considered. It was shown that the “rough” estimation method, which contains no averaging procedure, is optimal in terms of recognition accuracy (Acc%) for SNR > 15 … 17 dB.
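The abstract names the estimators but does not give their formulas. As a minimal sketch, assuming the widely used Ephraim-Malah form of the decision-directed rule, a Wiener gain for the clean-speech estimate, and reading the “rough” estimate as the instantaneous value max(γ − 1, 0) with no averaging (an assumption, since the abstract does not define it), the two estimators for a single frequency bin could look like this. The function names, the smoothing factor `alpha`, and the floor `xi_min` are illustrative choices, not taken from the paper.

```python
def rough_snr(noisy_power, noise_power):
    """Instantaneous a priori SNR per frame: max(gamma - 1, 0), no averaging.

    Assumed reading of the "rough" estimate; noisy_power holds |Y(t)|^2 for
    one frequency bin, noise_power is the (known) noise power in that bin.
    """
    return [max(y / noise_power - 1.0, 0.0) for y in noisy_power]


def decision_directed_snr(noisy_power, noise_power, alpha=0.98, xi_min=1e-3):
    """Decision-directed a priori SNR (Ephraim-Malah form) for one bin.

    xi(t) = alpha * |A(t-1)|^2 / noise_power + (1 - alpha) * max(gamma(t) - 1, 0)
    where |A(t-1)|^2 is the previous clean-speech power estimate, obtained
    here with a Wiener gain (an assumption of this sketch).
    """
    estimates = []
    prev_clean_power = 0.0
    for t, y_pow in enumerate(noisy_power):
        gamma = y_pow / noise_power            # a posteriori SNR
        instantaneous = max(gamma - 1.0, 0.0)  # same term as the rough estimate
        if t == 0:
            xi = instantaneous                 # no history for the first frame
        else:
            xi = (alpha * prev_clean_power / noise_power
                  + (1.0 - alpha) * instantaneous)
        xi = max(xi, xi_min)                   # floor to avoid a zero gain
        gain = xi / (1.0 + xi)                 # Wiener gain
        prev_clean_power = (gain ** 2) * y_pow
        estimates.append(xi)
    return estimates


if __name__ == "__main__":
    frames = [4.0, 1.0, 0.5]                   # toy |Y(t)|^2 values
    print(rough_snr(frames, 1.0))
    print(decision_directed_snr(frames, 1.0))
```

With `alpha` close to 1, the decision-directed estimate decays slowly after a speech frame, while the instantaneous estimate drops to zero at once; this is the averaging behaviour that the abstract contrasts with the “rough” method.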
A new method for classifying a speaker’s gender based on cumulant coefficients is proposed. The effect of additive noise and of measurement error in the classification features on classification accuracy is analyzed. The expediency of constructing an adaptive classification system that takes into account the masking of the speech signal by noise is shown. Comparison of the proposed method of...
Refined recommendations for choosing the parameters of the late reverberation suppression technique that maximize automatic speech recognition (ASR) accuracy are proposed in this paper. It was shown that the best value of the boundary between early reflections and late reverberation is approximately 100 ms for ASR systems. It was also shown that, when estimating late reverberation power...
Enhancement of speech distorted by reverberation is a topical issue. The problem has been actively studied in the last decade. However, it is still extremely difficult to find clear recommendations on the choice of the boundary between early reflections and late reverberation that is optimal with respect to criteria such as speech recognition accuracy and speech quality. Another problem is obtaining a simple pre-processor...
In this paper, analytical and experimental studies of a formant-modulation method of speech intelligibility estimation are carried out. Conditions for achieving the required measurement accuracy that are convenient for engineering applications are obtained.
In this paper, a rapid version of a formant-modulation method of speech intelligibility estimation is studied. It is shown that this version reduces the estimation time by a factor of fourteen.
Three objective methods of speech intelligibility evaluation were compared: the method of partial signal-to-noise ratios, the formant method, and the modulation method. The possibility of combining the advantages of the formant and modulation methods is shown.