sequence during training. This paper explores the design of an ASR-free end-to-end system for text query-based keyword search (KWS) from speech trained with minimal supervision. Our E2E KWS system consists of three sub-systems. The first sub-system is a recurrent neural network (RNN)-based acoustic auto-encoder trained to
In this work, keyword search (KWS) is based on a symbolic index that uses a posteriorgram representation of the speech data. For each query, sum-to-one normalization or keyword-specific thresholding is applied to the search results. The effect of these methods on the proposed KWS system is investigated. Results are
This paper describes experiments on comparing audio clips based on spoken content. The spoken content is obtained using automatic speech recognition. The social tags that are available for most of the audio clips are used as keywords. These keywords are mapped to the spoken transcription representing the audio clips
In particular for “low resource” Keyword Search (KWS) and Speech-to-Text (STT) tasks, more untranscribed test data may be available than training data. Several approaches have been proposed to make this data useful during system development, even when initial systems have Word Error Rates (WER) above 70
prototype system demonstrates our latest development on automatic speech recognition, keyword spotting, personalized text-to-speech synthesis and visual speech synthesis. The second demo exhibits a virtual concert with immersive audio effects. Through our virtual auditory technology, wearing simple earphones, listeners are
The paper discusses the overall design scheme of an intelligent information service platform based on automatic speech recognition and a geographical information system, with an open multimedia operation platform as the carrier. This platform can implement good communication between the user and the system through keyword
Speech Recognition (ASR), a multilingual Text-to-Speech system with other enhanced features such as a keyword search facility, intelligent/auto customization in accordance with the user, and paper-independent classified headings. The integration of ASR enables the user to operate the system in a completely hands-free mode.