Search results

chapter

Radio-browsing for developmental monitoring in Uganda

Raghav Menon, Armin Saeb, Hugh Cameron, William Kibira, more

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5795 - 5799

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

automatic speech recognisers using HMM/GMM, SGMM and DNN/HMM acoustic models as keyword spotters. We present the first results indicating promising performance of the radio-browsing system.

chapter

Investigations on byte-level convolutional neural networks for language modeling in low resource speech recognition

Kazuki Irie, Pavel Golik, Ralf Schluter, Hermann Ney

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5740 - 5744

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

characters, even on syllabic alphabets like Amharic. In addition, we report improvements in word error rate from rescoring lattices and evaluate keyword search performance on several languages.

chapter

Acoustic data-driven pronunciation lexicon generation for logographic languages

Guoguo Chen, Daniel Povey, Sanjeev Khudanpur

2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5350 - 5354

ICASSP 2016 - 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Cantonese speech recognition and keyword search tasks. Experiments show that starting from an expert lexicon of only 1K words, we are able to generate a lexicon that works reasonably well when compared with an expert-crafted lexicon of 5K words.

chapter

Voice-activity home care system

Oscal T.-C. Chen, Y. H. Tsai, C. W. Su, P. C. Kuo, more

2016 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI) > 110 - 113

2016 IEEE-EMBS 3rd International Conference on Biomedical and Health Informatics (BHI)

This work proposes a voice-activity home care system which can construct a life log associated with voices at home. Accordingly, the techniques of sound-pressure-level calculation, abnormal sound detection, noise reduction, text-independent speaker recognition and keyword spotting are developed. In abnormal sound

chapter

Improving data selection for low-resource STT and KWS

Thiago Fraga-Silva, Antoine Laurent, Jean-Luc Gauvain, Lori Lamel, more

2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) > 153 - 159

2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)

This paper extends recent research on training data selection for speech transcription and keyword spotting system development. Selection techniques were explored in the context of the IARPA-Babel Active Learning (AL) task for 6 languages. Different selection criteria were considered with the goal of improving over a

chapter

Semi-supervised training in low-resource ASR and KWS

Florian Metze, Ankur Gandhe, Yajie Miao, Zaid Sheikh, more

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4699 - 4703

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

In particular for “low resource” Keyword Search (KWS) and Speech-to-Text (STT) tasks, more untranscribed test data may be available than training data. Several approaches have been proposed to make this data useful during system development, even when initial systems have Word Error Rates (WER) above 70

chapter

Investigation of multilingual deep neural networks for spoken term detection

K. M. Knill, M. J. F. Gales, S. P. Rath, P. C. Woodland, more

2013 IEEE Workshop on Automatic Speech Recognition and Understanding > 138 - 143

2013 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)

an additional held-out target language. STT gains achieved through using multilingual bottleneck features in a Tandem configuration are shown to also apply to keyword search (KWS). Further improvements in both STT and KWS were observed by incorporating language questions into the Tandem GMM-HMM decision trees for the

chapter

A LDA-based method for automatic tagging of Youtube videos

Mohamed Morchid, Georges Linares

2013 14th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS) > 1 - 4

2013 14th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS)

This article presents a method for automatic tagging of Youtube videos. The proposed method combines an automatic speech recognition (ASR) system, that extracts the spoken contents, and a keyword extraction component that aims at finding a small set of tags representing a video. In order to improve the robustness of

chapter

SVM-based knowledge topic identification toward the autonomous knowledge acquisition

Keedong Yoo

2011 IEEE 9th International Symposium on Applied Machine Intelligence and Informatics (SAMI) > 149 - 154

2011 IEEE 9th International Symposium on Applied Machine Intelligence and Informatics (SAMI)

One of the most serious problems of conventional knowledge management (KM) has been identified as the tardy and ineffective acquisition of knowledge. To resolve this problem, knowledge must be autonomously acquired according to its context of use by applying the technique of keyword extraction in machine

chapter

Speech and Auditory Interfaces for Ubiquitous, Immersive and Personalized Applications

Lei Xie, Wenhuai Zhao, Xiangzeng Zhou, Xiaohai Tian, more

2010 7th International Conference on Ubiquitous Intelligence & Computing and 7th International Conference on Autonomic & Trusted Computing > 503 - 505

2010 7th International Conference on Ubiquitous Intelligence & Computing and 7th International Conference on Autonomic & Trusted Computing (UIC/ATC 2010)

prototype system demonstrates our latest development on automatic speech recognition, keyword spotting, personalized text-to-speech synthesis and visual speech synthesis. The second demo exhibits a virtual concert with immersive audio effects. Through our virtual auditory technology, wearing simple earphones, listeners are

chapter

Location Aware Question Answering Based Product Searching in Mobile Handheld Devices

S A Hossain, A S M M Rahman, T T Tran, A E Saddik

2010 IEEE/ACM 14th International Symposium on Distributed Simulation and Real Time Applications > 189 - 195

2010 IEEE/ACM 14th International Symposium on Distributed Simulation and Real Time Applications (DS-RT 2010)

, the system carries out a conversation with the user to explicitly understand his/her needs and accordingly filters search results for display. The conversation between the system and the user is based on word co-occurrence keyword extraction and the Artificial Intelligence Markup Language (AIML) technique. As per initial

chapter

Web-based real time content processing and monitoring service for digital TV broadcast

Zhu Liu, David Gibbon, Behzad Shahraray

2010 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB) > 1 - 6

2010 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB 2010)

extraction, text normalization, keyword extraction, shot boundary detection, face detection and recognition, and near duplicate keyframe detection. These processing components detect a rich set of metadata information, which is collected by the video monitoring server. On a web interface, users can tune to different digital TV

chapter

Language model adaptation using WWW documents obtained by utterance-based queries

Andreas Tsiartas, Panayiotis Georgiou, Shrikanth Narayanan

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 5406 - 5409

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

results in up to 1.1% absolute Word Error Rate (WER) improvement as compared to keyword-based approaches. The proposed approach reduces the WER by 6.3% absolute in our experiments, compared to an in-domain LM without considering any Web data.

chapter

Hierarchical audio-visual cue integration framework for activity analysis in intelligent meeting rooms

S.T. Shivappa, M.M. Trivedi, B.D. Rao

2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops > 107 - 114

2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

recognition using audio and visual cues. The novelty lies in putting together the tasks such that they can provide relevant information to one another. We evaluate the performance of our system and present results for tasks such as keyword spotting and tracking re-identification on real-world meeting scenes collected in our

chapter

Gaze-contingent asr for spontaneous, conversational speech: An evaluation

N. Cooke, M. Russell

2008 IEEE International Conference on Acoustics, Speech and Signal Processing > 4433 - 4436

ICASSP 2008. IEEE International Conference on Acoustics, Speech and Signal Processing

increase in keyword spotting accuracy. The key finding was that performance improvements observed were due to increased recognition accuracy for words associated with the visual field but not the current focus of visual attention.

chapter

Towards unsupervised online word clustering

H. Brandl, F. Joublin, C. Goerick

2008 IEEE International Conference on Acoustics, Speech and Signal Processing > 5073 - 5076

ICASSP 2008. IEEE International Conference on Acoustics, Speech and Signal Processing

system for unsupervised word-clustering, which is able to recognize and learn the structure of speech online in a unified framework. To do so we've extended HMM-based filler-free keyword spotting with acoustic model acquisition. To evaluate and control the dynamics of the combined acquisition-recognition process we propose

chapter

The Development of Automatic Speech Recognition Software for Portable Devices

Myoung-Wan Koo, Joon-Ki Choi, Young-Myoung Kim

First International Conference on Advances in Computer-Human Interaction > 59 - 62

2008 First International Conference on Advances in Computer Human Interaction - ACHI '08

, we propose a method for detecting keywords and rejecting out-of-vocabulary (OOV) words. It consists of a filler-modeling technique and utterance verification. Finally, we implement the ASR software on PDAs (Samsung SPH-M4300 and HP iPAQ-RW6100), one kind of portable device. It works in 54.7% of real-time with the

chapter

Automatic speech recognition based on weighted minimum classification error (W-MCE) training method

Qiang Fu, Biing-Hwang Juang

2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU) > 278 - 283

2007 IEEE Workshop on Automatic Speech Recognition and Understanding

error. However, this prevalent performance metric is not desirable in many practical applications. For example, the cost of "recognition" error is required to be differentiated in keyword spotting systems. In this paper, we propose an extended framework for the speech recognition problem with non-uniform classification

chapter

Automatic Query Expansion for News Video Retrieval

Yun Zhai, Jingen Liu, M. Shah

2006 IEEE International Conference on Multimedia and Expo > 965 - 968

2006 IEEE International Conference on Multimedia and Expo

round, keyword histograms are automatically generated for the refinement of the search query, such that the reformulated query fits better to the target topic. We have also developed an image-based refinement module, which uses the region analysis of the video key-frames. SR-tree like indexing structure is constructed for

chapter

An Implementation of Viterbi Algorithm on GPU

Dan Zhang, Rongcai Zhao, Lin Han, Tao Wang, more

2009 First International Conference on Information Science and Engineering > 121 - 124

2009 1st International Conference on Information Science and Engineering (ICISE 2009)

General purpose computation based on GPU is a hot topic for research in recent years. The paper presents the parallel implementation of Viterbi algorithm on GPU based on features of GPU and characteristics of Viterbi algorithm in keyword spotting system. The results of examination by using NVIDIA 9600 GT GPU show that

INFONA - science communication portal
