Search results

chapter

Modeling long temporal contexts in convolutional neural network-based phone recognition

Laszlo Toth

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4575 - 4579

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The deep neural network component of current hybrid speech recognizers is trained on a context of consecutive feature vectors. Here, we investigate whether the time span of this input can be extended by splitting it up and modeling it in smaller chunks. One method for this is to train a hierarchy of two networks, while the less well-known split temporal context (STC) method models the left and right...

chapter

A deep recurrent approach for acoustic-to-articulatory inversion

Peng Liu, Quanjie Yu, Zhiyong Wu, Shiyin Kang, more

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4450 - 4454

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

To solve the acoustic-to-articulatory inversion problem, this paper proposes a deep bidirectional long short term memory recurrent neural network and a deep recurrent mixture density network. The articulatory parameters of the current frame may have correlations with the acoustic features many frames before or after. The traditional pre-designed fixed-length context window may be either insufficient...

chapter

Neuron sparseness versus connection sparseness in deep neural network for large vocabulary speech recognition

Jian Kang, Cheng Lu, Meng Cai, Wei-Qiang Zhang, more

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4954 - 4958

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Exploiting sparseness in deep neural networks is an important method for reducing the computational cost. In this paper, we study neuron sparseness in deep neural networks for acoustic modeling. For the feed-forward stage, we only activate neurons whose input values are larger than a given threshold, and set the outputs of inactive nodes to zero. Thus, only a few nonzero outputs are fed to the next...

chapter

Enhancing automatically discovered multi-level acoustic patterns considering context consistency with applications in spoken term detection

Cheng-Tao Chung, Wei-Ning Hsu, Cheng-Yi Lee, Lin-Shan Lee

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5231 - 5235

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

This paper presents a novel approach for enhancing the multiple sets of acoustic patterns automatically discovered from a given corpus. In a previous work it was proposed that different HMM configurations (number of states per model, number of distinct models) for the acoustic patterns form a two-dimensional space. Multiple sets of acoustic patterns automatically discovered with the HMM configurations...

chapter

Regularization of context-dependent deep neural networks with context-independent multi-task training

Peter Bell, Steve Renals

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4290 - 4294

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The use of context-dependent targets has become standard in hybrid DNN systems for automatic speech recognition. However, we argue that despite the use of state-tying, optimising to context-dependent targets can lead to over-fitting, and that discriminating between arbitrary tied context-dependent targets may not be optimal. We propose a multitask learning method where the network jointly predicts...

chapter

Adaptive statistical utterance phonetization for French

Gwenole Lecorve, Damien Lolive

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4864 - 4868

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Traditional utterance phonetization methods concatenate pronunciations of uncontextualized constituent words. This approach is too weak for some languages, like French, where transitions between words imply pronunciation modifications. Moreover, it makes it difficult to consider global pronunciation strategies, for instance to model a specific speaker or a specific accent. To overcome these problems,...

chapter

Context dependent phone models for LSTM RNN acoustic modelling

Andrew Senior, Hasim Sak, Izhak Shafran

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4585 - 4589

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Long Short Term Memory Recurrent Neural Networks (LSTM RNNs), combined with hidden Markov models (HMMs), have recently been shown to outperform other acoustic models such as Gaussian mixture models (GMMs) and deep neural networks (DNNs) for large scale speech recognition. We argue that using multi-state HMMs with LSTM RNN acoustic models is an unnecessary vestige of GMM-HMM and DNN-HMM modelling since...

chapter

Convolutional, Long Short-Term Memory, fully connected Deep Neural Networks

Tara N. Sainath, Oriol Vinyals, Andrew Senior, Hasim Sak

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4580 - 4584

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Both Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) have shown improvements over Deep Neural Networks (DNNs) across a wide variety of speech recognition tasks. CNNs, LSTMs and DNNs are complementary in their modeling capabilities, as CNNs are good at reducing frequency variations, LSTMs are good at temporal modeling, and DNNs are appropriate for mapping features to a more separable...

chapter

Deep neural networks employing Multi-Task Learning and stacked bottleneck features for speech synthesis

Zhizheng Wu, Cassia Valentini-Botinhao, Oliver Watts, Simon King

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4460 - 4464

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Deep neural networks (DNNs) use a cascade of hidden representations to enable the learning of complex mappings from input to output features. They are able to learn the complex mapping from text-based linguistic features to speech acoustic features, and so perform text-to-speech synthesis. Recent results suggest that DNNs can produce more natural synthetic speech than conventional HMM-based statistical...

chapter

A contextual model for semantic video structuring

Bruno Janvier, Eric Bruno, Stephane Marchand-Maillet, Thierry Pun

2005 13th European Signal Processing Conference > 1 - 4

2005 13th European Signal Processing Conference

The problem of semantic video structuring is vital for automated management of large video collections. The goal is to automatically extract from the raw data the inner structure of a video collection; so that a whole new range of applications to browse and search video collections can be derived out of this high-level segmentation. To reach this goal, we exploit techniques that consider the full...

chapter

Exploiting phonetic and phonological similarities as a first step for robust speech recognition

Julie Mauclair, Daniel Aioanei, Julie Carson-Berndsen

2009 17th European Signal Processing Conference > 1750 - 1754

2009 17th European Signal Processing Conference

This paper presents two speech recognition systems which use the notion of phonetic and phonological similarity to improve the robustness of phoneme recognition. The first recognition system, YASPER, uses phonetic feature extraction engines to identify phonemes based on overlap relations between phonetic features. The second system uses the CMU Sphinx 3.7 decoder based on statistical context-dependent...

chapter

Using context overlays to analyse the role of a priori information with Process Mining

Paolo Pileggi, Alejandro Rivero-Rodriquez, Ossi Nykanen

2015 Annual IEEE Systems Conference (SysCon) Proceedings > 639 - 644

2015 9th Annual IEEE International Systems Conference (SysCon)

Notwithstanding the significant advances in context-aware computing in pervasive computing and self-adaptive systems, there is still much more to be desired in providing better context services. The number of sensors deployed world-wide increases very rapidly. The Internet of Things, amongst others, generates vast amounts of data of many different data types. How data are used is essential to improve...

chapter

Combining Syntactic Information with HMM for Term Extraction

Hua-Shan Pan, Ji-Yuan Zhao

2015 2nd International Conference on Information Science and Control Engineering > 170 - 173

2015 2nd International Conference on Information Science and Control Engineering (ICISCE)

Aiming at the problem of Chinese thesaurus construction, we propose a method of using HMM to extract new terms from academic literature to expand automatically entry-words for Chinese thesaurus. This method converts the new terms extraction problem to a sequence labelling problem. It uses HMM fully integrated lexical information and syntactic information of new terms, as well as local context information,...

chapter

Map-based context dependent tone recognition method of Chinese speech

Li Ming, Liu Jian, Yu Tiecheng

2000 10th European Signal Processing Conference > 1 - 3

2000 10th European Signal Processing Conference

This paper presents a new context dependent tone recognition method. First we suggest that there be more than five tone modes in Chinese continuous speech. We get all new tone modes by grouping all tone feature vectors to a specific number of categories. Secondly, we recognize a sentence with the new tone modes and get the new tone sequence. Finally, we find out each original tone of the sentence...

chapter

Towards reputation measurement in online social networks

Mouna El Marrakchi, Mostafa Bellafkih, Hicham Bensaid

2015 Intelligent Systems and Computer Vision (ISCV) > 1 - 8

2015 Intelligent Systems and Computer Vision (ISCV)

E-Reputation is gaining increasing attention among companies. Many brands are making deep invests in managing their image across the web and virtual communities. Thereby, marketers try to access to large volumes of data generated by e-reputation analysis. Their main issue is detecting what is said about their brand and how it can impact their business. As social media contributes in assessing opinions...

chapter

PEMAR: A pervasive middleware for activity recognition with smart phones

Prakash Vaka, Feichen Shen, Mayanka Chandrashekar, Yugyung Lee

2015 IEEE International Conference on Pervasive Computing and Communication Workshops (PerCom Workshops) > 409 - 414

2015 IEEE International Conference on Pervasive Computing and Communication Workshops (PerCom Workshops)

The growing affordability of smart phones and mobile devices has only added to this trend by encouraging prolonged durations of inactivity. In this paper, we present a middleware, called the Pervasive Middleware for Activity Recognition (PEMAR) that aims to increase the level of physical activity by creating a middleware for active games on mobile devices. For the PEMAR application, we present a human...

chapter

TnT tagger with fuzzy rule based learning

Alen Jacob, Amal Babu, P C Reghu Raj

2015 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems (SPICES) > 1 - 5

2015 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems (SPICES)

TnT is an efficient statistical Parts-of-speech (POS) Tagger based on Hidden Markov Model. TnT stands for Trigrams‘n’Tags. Viterbi algorithm is used for finding the best tag sequence for a given observation sequence of words. TnT performs well on known word sequences. But, the performance degrades with increase in the number of unknown words. In this paper, we propose a method to overcome this performance...

chapter

A hybrid epidemic model for antinormative behavior in online social networks

Cong Liao, Anna Squicciarini, Christopher Griffin, Sarah Rajtmajer

2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM) > 1563 - 1564

2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)

In this paper, we describe a novel approach to investigate negative behavior dynamics in online social networks as epidemic phenomena. We present a finite-state machine model for time-varying epidemic dynamics, and validate this model with experiments over a large dataset of YouTube commentaries, indicating how different epidemic patterns of behavior can be tied to specific interaction patterns among...

chapter

Emotion analysis of children's stories with context information

Zhengchen Zhang, Minghui Dong, Shuzhi Sam Ge

Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific > 1 - 7

2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

In this paper, we analyse the emotion of children's stories in sentence level by considering the context information. We demonstrate that the emotion of a sentence is not only dependent on its content, but also affected by its neighbours in a story. A Hidden Markov Model (HMM) based method is proposed to model the emotion sequence and to detect whether a sentence is neutral or not. We show the important...

chapter

HMM-based Thai speech synthesis using unsupervised stress context labeling

Decha Moungsri, Tomoki Koriyama, Takao Kobayashi

Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific > 1 - 4

2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

This paper describes an approach to HMM-based Thai speech synthesis using stress context. It has been shown that context related to stressed/unstressed syllable information (stress context) significantly improves the tone correctness of the synthetic speech, but there is a problem of requiring a manual context labeling process in tone modeling. To reduce costs for the stress context labeling, we propose...

INFONA - science communication portal
