Search results

chapter

Relationship between perception and production of English vowels by Chinese English learners

Aihui Zhang, Hui Feng, Siyu Wang, Jianwu Dang

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

In previous studies, no consensus has been reached on the existence of significant correlation between perception and production. A large number of empirical studies have been done upon first and second languages from different language families. However, few studies were carried out on the perception-production relation of Chinese English learners. Therefore, in the current study, under the theoretical...

chapter

Text-based sentential stress prediction using continuous lexical embedding for Mandarin speech synthesis

Yibin Zheng, Ya Li, Zhengqi Wen, Bin Liu, et al.

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

Stress is an important parameter for prosody processing in speech synthesis. However, it is not easy to predict stress from text analysis due to the complicated information. In this paper, we explore the novel use of continuous lexical embedding and a bidirectional long short-term memory recurrent neural network (BLSTM) model for sentential stress prediction in Mandarin speech synthesis. We look at augmenting...

chapter

DNN-based unit selection using frame-sized speech segments

Zhi-Ping Zhou, Zhen-Hua Ling

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

This paper presents a deep neural network (DNN)-based unit selection method for waveform concatenation speech synthesis using frame-sized speech segments. In this method, three DNNs are adopted to calculate target costs and concatenation costs respectively for selecting frame-sized candidate units. The first DNN is built in the same way as the DNN-based statistical parametric speech synthesis, which...

chapter

Rich punctuations prediction using large-scale deep learning

Xueyang Wu, Su Zhu, Yue Wu, Kai Yu

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

Punctuation plays an important role in language processing. However, automatic speech recognition systems only output plain word sequences. It is then of interest to predict punctuations on plain word sequences. Previous works have focused on using lexical features or prosodic cues captured from small corpus to predict simple punctuations. Compared with simple punctuations, rich punctuations provide...

chapter

Multi-task joint-learning for robust voice activity detection

Yimeng Zhuang, Sibo Tong, Maofan Yin, Yanmin Qian, et al.

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

Model-based VAD approaches have been widely used and have achieved success in practice. These approaches usually cast VAD as a frame-level classification problem and employ statistical classifiers, such as a Gaussian Mixture Model (GMM) or Deep Neural Network (DNN), to assign a speech/silence label to each frame. Because the classification assumes frame independence, the VAD results tend to be fragile....

chapter

On training bi-directional neural network language model with noise contrastive estimation

Tianxing He, Yu Zhang, Jasha Droppo, Kai Yu

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

Although the uni-directional recurrent neural network language model (RNNLM) has been very successful, it is hard to train a bi-directional RNNLM properly due to the generative nature of the language model. In this work, we propose to train a bi-directional RNNLM with noise contrastive estimation (NCE), since the properties of NCE training will help the model to achieve sentence-level normalization. Experiments...

chapter

Rapid speaker adaptation based on D-code extracted from BLSTM-RNN in LVCSR

Shaofei Xue, Zhijie Yan, Zhiying Huang, Lirong Dai

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

Recently, several fast speaker adaptation methods have been proposed for the hybrid DNN-HMM models based on the so-called discriminative speaker codes (SC) [1-3] and applied to unsupervised speaker adaptation in speech recognition [4]. It has been demonstrated that the SC based methods are quite effective in adapting DNNs even when only a very small amount of adaptation data is available. However,...

chapter

Learning objective agent behavior using a data-driven modeling approach

Farzad Kamrani, Linus J. Luotsinen, Rikke Amilde Lovlid

2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC) > 2175 - 2181

This paper presents a data-driven approach towards the modeling of agent behaviors in a full-fledged, commercial off-the-shelf simulation milieu for tactical military training. The modeling approach employs machine learning to identify behavioral rules and patterns in data. Potential advantages of this approach are that it may improve modeling efficiency and, perhaps more importantly, increase the...

chapter

Ridiculously Expensive Watches and Surprisingly Many Reviewers: A Study of Irony

Pavel Savov, Radoslaw Nielek

2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI) > 725 - 729

Irony is something most people can tell is there when they see it, but it is not so easy to define, let alone detect automatically. In this paper we describe the construction of a balanced corpus of ironic vs. serious watch reviews and show the promising results achieved by classifiers trained on this corpus in predicting the presence of irony or lack thereof in product reviews from a manually labeled corpus...

chapter

From Opinion Lexicons to Sentiment Classification of Tweets and Vice Versa: A Transfer Learning Approach

Felipe Bravo-Marquez, Eibe Frank, Bernhard Pfahringer

2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI) > 145 - 152

Message-level and word-level polarity classification are two popular tasks in Twitter sentiment analysis. They have been commonly addressed by training supervised models from labelled data. The main limitation of these models is the high cost of data annotation. Transferring existing labels from a related problem domain is one possible solution for this problem. In this paper, we propose a simple...

chapter

Learning Text-Line Localization with Shared and Local Regression Neural Networks

Bastien Moysset, Jerome Louradour, Christopher Kermorvant, Christian Wolf

2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR) > 1 - 6

Text line detection and localisation is a crucial step for full page document analysis, but still suffers from heterogeneity of real life documents. In this paper, we present a novel approach for text line localisation based on Convolutional Neural Networks and Multidimensional Long Short-Term Memory cells as a regressor in order to predict the coordinates of the text line bounding boxes directly...

chapter

Recurrent Neural Networks for Transmission Opportunity Forecasting

Paulo A. L. Ferreira, Silas S. Fernandes, Rodrigo R. Bezerra, Marcus V. Lamar, et al.

2016 IEEE 13th International Conference on Mobile Ad Hoc and Sensor Systems (MASS) > 382 - 383

One of the major challenges in opportunistic networks is the correct identification of a transmission opportunity and its corresponding duration. In this work, recurrent neural network structures are investigated for transmission opportunity forecast. The proposed method is based on in-channel spectrum sensing and the use of Elman recurrent neural network to model the occupation of the channel. The...

chapter

Class-Based Contextual Modeling for Handwritten Arabic Text Recognition

Irfan Ahmad, Gernot A. Fink

2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR) > 554 - 559

In this paper we will present our investigations related to contextual modeling for HMM-based handwritten Arabic text recognition. We will, first, discuss the justifications and the need for contextual modeling for handwritten Arabic text recognition. Next, we will discuss the issues related to contextual modeling for Arabic text recognition. Finally, we will present our novel class-based contextual...

chapter

Integration of data sources in an automatic corrector of Arabic texts

Jaouad Outifa, Si Lhoussain Aouragh, Said El Alaoui Ouatik

2016 4th IEEE International Colloquium on Information Science and Technology (CiSt) > 344 - 348

Unlike French and English, the richness and ambiguity of written Arabic texts cause a great deal of errors. The purpose of this article is to resolve issues of tolerance of some errors in Arabic texts and to develop an automatic detection system as well as a correction system of those errors. This work represents a combination of the Levenshtein Distance (LD) and bi-context language models based on...

chapter

Handwriting Recognition with Large Multidimensional Long Short-Term Memory Recurrent Neural Networks

Paul Voigtlaender, Patrick Doetsch, Hermann Ney

2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR) > 228 - 233

Multidimensional long short-term memory recurrent neural networks achieve impressive results for handwriting recognition. However, with current CPU-based implementations, their training is very expensive and thus their capacity has so far been limited. We release an efficient GPU-based implementation which greatly reduces training times by processing the input in a diagonal-wise fashion. We use this...

chapter

Line-of-Sight Stroke Graphs and Parzen Shape Context Features for Handwritten Math Formula Representation and Symbol Segmentation

Lei Hu, Richard Zanibbi

2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR) > 180 - 186

This paper presents a new representation for handwritten math formulae: a Line-of-Sight (LOS) graph over handwritten strokes, computed using stroke convex hulls. Experimental results using the CROHME 2012 and 2014 datasets show that LOS graphs capture the visual structure of handwritten formulae better than commonly used graphs such as Time-series, Minimum Spanning Trees, and k-Nearest Neighbor graphs...

chapter

Controlling Swarms by Visual Demonstration

Karan K. Budhraja, Tim Oates

2016 IEEE 10th International Conference on Self-Adaptive and Self-Organizing Systems (SASO) > 1 - 10

Agent-based modeling is a paradigm of modeling dynamic systems of interacting agents that are individually governed by specified behavioral rules. Training a model of such agents to produce an emergent behavior by specification of the emergent (as opposed to agent) behavior is easier from a demonstration perspective. While many approaches involve manual behavior specification via code or reliance...

chapter

The open-set problem in acoustic scene classification

Daniele Battaglino, Ludovick Lepauloux, Nicholas Evans

2016 IEEE International Workshop on Acoustic Signal Enhancement (IWAENC) > 1 - 5

Acoustic scene classification (ASC) has attracted growing research interest in recent years. Whereas the previous work has investigated closed-set classification scenarios, the predominant ASC application is open-set in nature. The contributions of the paper are (i) the first investigation of ASC in an open-set scenario, (ii) the formulation of open-set ASC as a detection problem, (iii) a classifier...

chapter

Glyph miner: A system for efficiently extracting glyphs from early prints in the context of OCR

Benedikt Budig, Thomas C. van Dijk, Felix Kirchner

2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL) > 31 - 34

While off-the-shelf OCR systems work well on many modern documents, the heterogeneity of early prints provides a significant challenge. To achieve good recognition quality, existing software must be “trained” specifically to each particular corpus. This is a tedious process that involves significant user effort. In this paper we demonstrate a system that generically replaces a common part of the training...

chapter

Zoom: A Serious Games Intervention Design Model - When Games Alone Are Not Enough!

Anthony L. Brooks

2016 8th International Conference on Games and Virtual Worlds for Serious Applications (VS-GAMES) > 1 - 6

This article posits reflections from the author's mature body of work that resulted in sizeable national (Denmark) and international (European) funded projects, a patent, a commercial product, and a Serious Games company. The main focus is on sharing a two-stage in-action and on-action emergent model for evaluating the use of ICT (serious games and creative expression) in healthcare and learning intervention...

INFONA - science communication portal
