Search results

Items from 41 to 60 out of 2,639 results

chapter

Residual neural networks for speech recognition

Hari Krishna Vydana, Anil Kumar Vuppala

2017 25th European Signal Processing Conference (EUSIPCO) > 543 - 547

Recent developments in deep learning methods have greatly influenced the performances of speech recognition systems. In a Hidden Markov model-Deep neural network (HMM-DNN) based speech recognition system, DNNs have been employed to model senones (context dependent states of HMM), where HMMs capture the temporal relations among senones. Due to the use of deeper networks, significant improvement...

chapter

Polish whispery speech recognition — Minimum sampling frequency

Piotr Kozierski, Talar Sadalla, Szymon Drgas, Adam Dabrowski, more

2017 22nd International Conference on Methods and Models in Automation and Robotics (MMAR) > 611 - 615

The article presents studies on automatic whispery speech recognition. In the performed research a new corpus with whispery speech has been used. It has been checked how the speech recognition quality changes with varying sampling frequency and signal frame length. It has been found that the optimal sampling frequency of whispery speech is about 32–48 kHz, while the optimal signal frame length...

chapter

Real-time monitoring system for potentially dangerous activities detection

Aleksandra Postawka

2017 22nd International Conference on Methods and Models in Automation and Robotics (MMAR) > 1005 - 1008

Cognitive impairments are an unavoidable community problem. People suffering from such diseases need all day long attention with varying care difficulty depending on the type of disorder. What makes care harder in the case of autism is the frequent occurrence of self aggressive behaviors. The monitoring system is supposed to detect such situations and differentiate them from similar normal activities...

chapter

VTLN-warped Gaussian posteriorgram for QbE-STD

Maulik C. Madhavi, Hemant A. Patil

2017 25th European Signal Processing Conference (EUSIPCO) > 563 - 567

Vocal Tract Length Normalization (VTLN) is a very important speaker normalization technique for speech recognition tasks. In this paper, we propose the use of Gaussian posteriorgram of VTLN-warped spectral features for a Query-by-Example Spoken Term Detection (QbE-STD). This paper presents the use of a Gaussian Mixture Model (GMM) framework for estimation of VTLN warping factor. This GMM framework...

chapter

Development of multilingual phone recognition system for Indian languages

K E Manjunath, K. Sreenivasa Rao, Dinesh Babu Jayagopi

2017 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems (SPICES) > 1 - 6

In this paper, the development of Multilingual Phone Recognition System (MPRS) in the context of Indian languages is described. MPRS is a language independent Phone Recognition System (PRS) that could recognise the phonetic units present in a speech utterance of any language. We have developed two Bilingual and a quadrilingual PRS using four Indian languages — Kannada, Telugu, Bengali, and Odia. International...

chapter

The impact of vocabulary size and language model order on the polish whispery speech recognition

Piotr Kozierski, Talar Sadalla, Szymon Drgas, Adam Dabrowski, more

2017 22nd International Conference on Methods and Models in Automation and Robotics (MMAR) > 616 - 621

The article presents studies on automatic whispery speech recognition. In the performed research a new corpus with whispery speech has been used. The aim of the studies presented in this paper was to check how the vocabulary size and the language model order influence the speech recognition quality. It has been concluded that even using recordings with only 5,000 different words it is possible...

chapter

Accurate Localization Using LTE Signaling Data

Lei Ni, Yuxin Wang, Haoyang Tang, Zhao Yin, more

2017 IEEE International Conference on Computer and Information Technology (CIT) > 268 - 273

In this paper, we propose a novel localization algorithm using LTE signaling data. Specifically, we use TA (Timing Advance) and RSRP (Reference Signal Receiving Power) data that are required in LTE standard and already available in current LTE systems. The combination of (TA, RSRP) is used as a signature, and one can expect that different locations will have distinctive signatures. Our real world...

chapter

A prognostic method for Radar transmitter based on DHMM

Pengfei Yu, Zhenwei Zhou, Liye Cheng, Yudong Lu

2017 Prognostics and System Health Management Conference (PHM-Harbin) > 1 - 5

For the randomness and uncertainty of fault for Radar transmitter, a prognostic method based on discrete Hidden Markov Model (DHMM) is proposed. In the paper, three monitoring parameters of transmitter are collected and a discrete Hidden Markov Model is established. In order to have a fast convergence, the Baum-Welch (B-W) algorithm is used for training of DHMM. Finally, the state probability transition...

chapter

Fault diagnosis method based on diffusion maps and hidden Markov model for TE process

Baoqi Liu, Jinxue Xu, Yuan Li

2017 36th Chinese Control Conference (CCC) > 7253 - 7258

To reduce data-storage costs and enhance high accuracy of industrial process fault detection, a data driven fault diagnosis method is proposed based on diffusion maps and hidden Markov model. Firstly, the correlation dimension of sample data is calculated. Secondly, the high-dimensional eigenvectors are extracted into low-dimensional manifold space by diffusion maps. Finally, the low-dimensional eigenvectors...

chapter

Creativity: Generating Diverse Questions Using Variational Autoencoders

Unnat Jain, Ziyu Zhang, Alexander Schwing

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5415 - 5424

Generating diverse questions for given images is an important task for computational education, entertainment and AI assistants. Different from many conventional prediction techniques is the need for algorithms to generate a diverse set of plausible questions, which we refer to as creativity. In this paper we propose a creative algorithm for visual question generation which combines the advantages...

chapter

Learning and Refining of Privileged Information-Based RNNs for Action Recognition from Depth Sequences

Zhiyuan Shi, Tae-Kyun Kim

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4684 - 4693

Existing RNN-based approaches for action recognition from depth sequences require either skeleton joints or hand-crafted depth features as inputs. An end-to-end manner, mapping from raw depth maps to action classes, is non-trivial to design due to the fact that: 1) single channel map lacks texture thus weakens the discriminative power, 2) relatively small set of depth training data. To address these...

chapter

Re-Sign: Re-Aligned End-to-End Sequence Modelling with Deep Recurrent CNN-HMMs

Oscar Koller, Sepehr Zargaran, Hermann Ney

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3416 - 3424

This work presents an iterative re-alignment approach applicable to visual sequence labelling tasks such as gesture recognition, activity recognition and continuous sign language recognition. Previous methods dealing with video data usually rely on given frame labels to train their classifiers. However, looking at recent data sets, these labels often tend to be noisy which is commonly overseen. We...

chapter

Recurrent Convolutional Neural Networks for Continuous Sign Language Recognition by Staged Optimization

Runpeng Cui, Hu Liu, Changshui Zhang

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1610 - 1618

This work presents a weakly supervised framework with deep neural networks for vision-based continuous sign language recognition, where the ordered gloss labels but no exact temporal locations are available with the video of sign sentence, and the amount of labeled sentences for training is limited. Our approach addresses the mapping of video segments to glosses by introducing recurrent convolutional...

chapter

Unsupervised learning of non-Gaussian mixtures with temporal dependencies

Gonzalo Safont, Addisson Salazar, Luis Vergara

2017 40th International Conference on Telecommunications and Signal Processing (TSP) > 399 - 402

Classification methods typically make use only of labeled data, in what is known as supervised learning. In some applications, however, labeled data is either scarce or costly to obtain. For these applications, unsupervised or semisupervised learning are adequate, since they will be able to use unlabeled data. This work proposes a new method for unsupervised and semisupervised learning of non-Gaussian...

chapter

Asynchronous Temporal Fields for Action Recognition

Gunnar A. Sigurdsson, Santosh Divvala, Ali Farhadi, Abhinav Gupta

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5650 - 5659

Actions are more than just movements and trajectories: we cook to eat and we hold a cup to drink from it. A thorough understanding of videos requires going beyond appearance modeling and necessitates reasoning about the sequence of activities, as well as the higher-level constructs such as intentions. But how do we model and reason about these? We propose a fully-connected temporal CRF model for reasoning...

chapter

Weakly Supervised Action Learning with RNN Based Fine-to-Coarse Modeling

Alexander Richard, Hilde Kuehne, Juergen Gall

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1273 - 1282

We present an approach for weakly supervised learning of human actions. Given a set of videos and an ordered list of the occurring actions, the goal is to infer start and end frames of the related action classes within the video and to train the respective action classifiers without any need for hand labeled frame boundaries. To address this task, we propose a combination of a discriminative representation...

chapter

Transition Forests: Learning Discriminative Temporal Transitions for Action Recognition and Detection

Guillermo Garcia-Hernando, Tae-Kyun Kim

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 407 - 415

A human action can be seen as transitions between one's body poses over time, where the transition depicts a temporal relation between two poses. Recognizing actions thus involves learning a classifier sensitive to these pose transitions as well as to static poses. In this paper, we introduce a novel method called transition forests, an ensemble of decision trees that both learn to discriminate static...

chapter

AdaScan: Adaptive Scan Pooling in Deep Convolutional Neural Networks for Human Action Recognition in Videos

Amlan Kar, Nishant Rai, Karan Sikka, Gaurav Sharma

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5699 - 5708

We propose a novel method for temporally pooling frames in a video for the task of human action recognition. The method is motivated by the observation that there are only a small number of frames which, together, contain sufficient information to discriminate an action class present in a video, from the rest. The proposed method learns to pool such discriminative and informative frames, while discarding...

chapter

Incorporating Copying Mechanism in Image Captioning for Learning Novel Objects

Ting Yao, Yingwei Pan, Yehao Li, Tao Mei

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5263 - 5271

Image captioning often requires a large set of training image-sentence pairs. In practice, however, acquiring sufficient training pairs is always expensive, making the recent captioning models limited in their ability to describe objects outside of training corpora (i.e., novel objects). In this paper, we present Long Short-Term Memory with Copying Mechanism (LSTM-C) — a new architecture...

chapter

On Human Motion Prediction Using Recurrent Neural Networks

Julieta Martinez, Michael J. Black, Javier Romero

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4674 - 4683

Human motion modelling is a classical problem at the intersection of graphics and computer vision, with applications spanning human-computer interaction, motion synthesis, and motion prediction for virtual and augmented reality. Following the success of deep learning methods in several computer vision tasks, recent work has focused on using deep recurrent neural networks (RNNs) to model human motion,...

Keywords:
TRAINING
HIDDEN MARKOV MODELS

Content availability

Available (2,627)
None (12)

Keywords

SPEECH (916)
SPEECH RECOGNITION (756)
FEATURE EXTRACTION (722)
ACOUSTICS (426)
HIDDEN MARKOV MODEL (334)
ACCURACY (312)
COMPUTATIONAL MODELING (307)
DATABASES (290)
DATA MODELS (286)
DATA MINING (233)
SUPPORT VECTOR MACHINES (228)
TRAINING DATA (212)
HMM (211)
HANDWRITING RECOGNITION (184)
TESTING (183)
NATURAL LANGUAGE PROCESSING (175)
MATHEMATICAL MODEL (161)
ARTIFICIAL NEURAL NETWORKS (151)
VECTORS (146)
NEURAL NETWORKS (137)
LEARNING (ARTIFICIAL INTELLIGENCE) (136)
ADAPTATION MODELS (135)
SPEECH PROCESSING (132)
SPEECH SYNTHESIS (129)
CONTEXT (116)
MEL FREQUENCY CEPSTRAL COEFFICIENT (116)
DECODING (111)
IMAGE SEGMENTATION (111)
SPEAKER RECOGNITION (109)
PROBABILITY (105)
AUTOMATIC SPEECH RECOGNITION (104)
HUMANS (99)
ADAPTATION MODEL (97)
TRAJECTORY (94)
CLASSIFICATION ALGORITHMS (93)
VOCABULARY (93)
GESTURE RECOGNITION (89)
GAUSSIAN PROCESSES (85)
MAXIMUM LIKELIHOOD ESTIMATION (85)
MARKOV PROCESSES (83)
CHARACTER RECOGNITION (82)
ERROR ANALYSIS (82)
PATTERN RECOGNITION (81)
TEXT ANALYSIS (80)
ESTIMATION (79)
DICTIONARIES (77)
VITERBI ALGORITHM (77)
NOISE (76)
PREDICTIVE MODELS (76)
IMAGE RECOGNITION (75)
MACHINE LEARNING (74)
PATTERN CLASSIFICATION (73)
OPTIMIZATION (72)
VISUALIZATION (71)
KERNEL (67)
ROBUSTNESS (66)
TAGGING (66)
CLUSTERING ALGORITHMS (63)
STATISTICAL ANALYSIS (63)
CONTEXT MODELING (62)
SHAPE (62)
FACE RECOGNITION (61)
JOINTS (57)
NOISE MEASUREMENT (57)
RECURRENT NEURAL NETWORKS (57)
IMAGE CLASSIFICATION (56)
NEURONS (56)
SUPPORT VECTOR MACHINE (55)
LABELING (53)
TRANSFORMS (53)
CONDITIONAL RANDOM FIELDS (52)
SENSORS (52)
STANDARDS (52)
ALGORITHM DESIGN AND ANALYSIS (51)
HANDWRITTEN CHARACTER RECOGNITION (51)
NEURAL NETS (50)
SEMANTICS (49)
BAYES METHODS (47)
DETECTORS (47)
FACE (47)
IMAGE SEQUENCES (47)
PRINCIPAL COMPONENT ANALYSIS (47)
CONFERENCES (46)
CAMERAS (45)
PROBABILISTIC LOGIC (43)
TEXT RECOGNITION (43)
DISCRIMINATIVE TRAINING (42)
NATURAL LANGUAGES (42)
COMPUTER VISION (41)
ENTROPY (41)
INFORMATION RETRIEVAL (41)
SIGNAL TO NOISE RATIO (41)
HEURISTIC ALGORITHMS (40)
IMAGE MOTION ANALYSIS (40)
LATTICES (40)
PATTERN CLUSTERING (40)
PIXEL (40)
BIOLOGICAL SYSTEM MODELING (39)

INFONA - science communication portal
