Search results

chapter

Verification based on palm vein by estimating wavelet coefficient with autoregressive model

Fereshte Yazdani, Mehran Emadi Andani

2017 2nd Conference on Swarm Intelligence and Evolutionary Computation (CSIEC) > 118 - 122

A biometric system is a pattern recognition system that automatically identifies people according to their physiological and behavioral properties. Among the physiological properties, the hand has a special place, since all of its features, such as palm lines, inner knuckles, external knuckles and geometry, can be used. More recently, the use of the blood vessel pattern in the palm, in addition to the high acceptability,...

chapter

Lombard speech synthesis using long short-term memory recurrent neural networks

Bajibabu Bollepalli, Manu Airaksinen, Paavo Alku

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5505 - 5509

In statistical parametric speech synthesis (SPSS), a few studies have investigated the Lombard effect, specifically by using hidden Markov model (HMM)-based systems. Recently, artificial neural networks have demonstrated promising results in SPSS, specifically by using long short-term memory recurrent neural networks (LSTMs). The Lombard effect, however, has not been studied in the LSTM-based speech...

chapter

Domain adaptation of DNN acoustic models using knowledge distillation

Taichi Asami, Ryo Masumura, Yoshikazu Yamaguchi, Hirokazu Masataki, et al.

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5185 - 5189

Constructing deep neural network (DNN) acoustic models from limited training data is an important issue for the development of automatic speech recognition (ASR) applications that will be used in various application-specific acoustic environments. To this end, domain adaptation techniques that train a domain-matched model without overfitting by leveraging pre-constructed source models are widely...

chapter

Cumulative moving averaged bottleneck speaker vectors for online speaker adaptation of CNN-based acoustic models

Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Atsunori Ogawa, et al.

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5175 - 5179

Adapting acoustic models to speakers has been shown to greatly improve performance for many tasks. Among the adaptation approaches, exploiting auxiliary features characterizing speakers or environments has received great attention because they allow rapid adaptation, i.e. adaptation with a limited amount of speech data such as a single utterance. However, the auxiliary features are usually computed in batch...

chapter

Multi-accent speech recognition with hierarchical grapheme based models

Kanishka Rao, Hasim Sak

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4815 - 4819

We train grapheme-based acoustic models for speech recognition using a hierarchical recurrent neural network architecture with connectionist temporal classification (CTC) loss. The models learn to align utterances with phonetic transcriptions in a lower layer and graphemic transcriptions in the final layer in a multi-task learning setting. Using the grapheme predictions from a hierarchical model trained...

chapter

The Microsoft 2016 conversational speech recognition system

W. Xiong, J. Droppo, X. Huang, F. Seide, et al.

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5255 - 5259

We describe Microsoft's conversational speech recognition system, in which we combine recent developments in neural-network-based acoustic and language modeling to advance the state of the art on the Switchboard recognition task. Inspired by machine learning ensemble techniques, the system uses a range of convolutional and recurrent neural networks. I-vector modeling and lattice-free MMI training...

chapter

Speech recognition in unseen and noisy channel conditions

Vikramjit Mitra, Horacio Franco, Chris Bartels, Julien van Hout, et al.

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5215 - 5219

Speech recognition in varying background conditions is a challenging problem. Acoustic condition mismatch between training and evaluation data can significantly reduce recognition performance. For mismatched conditions, data-adaptation techniques are typically found to be useful, as they expose the acoustic model to the new data condition(s). Supervised adaptation techniques usually provide substantial...

chapter

Adapting and controlling DNN-based speech synthesis using input codes

Hieu-Thi Luong, Shinji Takaki, Gustav Eje Henter, Junichi Yamagishi

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4905 - 4909

Methods for adapting and controlling the characteristics of output speech are important topics in speech synthesis. In this work, we investigated the performance of DNN-based text-to-speech systems that in parallel to conventional text input also take speaker, gender, and age codes as inputs, in order to 1) perform multi-speaker synthesis, 2) perform speaker adaptation using small amounts of target-speaker...

chapter

Visual features for context-aware speech recognition

Abhinav Gupta, Yajie Miao, Leonardo Neves, Florian Metze

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5020 - 5024

Automatic transcriptions of consumer-generated multimedia content such as “YouTube” videos still exhibit high word error rates. Such data typically occupies a very broad domain, has been recorded in challenging conditions, with cheap hardware and a focus on the visual modality, and may have been post-processed or edited.

chapter

Extended low-rank plus diagonal adaptation for deep and recurrent neural networks

Yong Zhao, Jinyu Li, Kshitiz Kumar, Yifan Gong

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5040 - 5044

Recently, the low-rank plus diagonal (LRPD) adaptation was proposed for speaker adaptation of deep neural network (DNN) models. The LRPD restructures the adaptation matrix as a superposition of a diagonal matrix and a product of two low-rank matrices. In this paper, we extend the LRPD adaptation into the subspace-based approach to further reduce the speaker-dependent (SD) footprint. We apply the extended...

chapter

Multi-task Curriculum Transfer Deep Learning of Clothing Attributes

Qi Dong, Shaogang Gong, Xiatian Zhu

2017 IEEE Winter Conference on Applications of Computer Vision (WACV) > 520 - 529

Recognising detailed clothing characteristics (fine-grained attributes) in unconstrained images of people in-the-wild is a challenging task for computer vision, especially when there is only limited training data from the wild whilst most data available for model learning are captured in well-controlled environments using fashion models (well lit, no background clutter, frontal view, high-resolution)...

chapter

Image super-resolution via weighted random forest

Zhi-Song Liu, Wan-Chi Siu, Jun-Jie Huang

2017 IEEE International Conference on Industrial Technology (ICIT) > 1019 - 1023

This paper proposes a novel learning-based image super-resolution via a weighted random forest model (SWRF). The proposed method uses the LR-HR training data to train a random forest model. The underlying idea of this approach is to use several decision trees to classify the training data based on a simple splitting threshold value at each class. A linear regression model is learnt to map the relationship...

chapter

Learning Attributes from Human Gaze

Nils Murrugarra-Llerena, Adriana Kovashka

2017 IEEE Winter Conference on Applications of Computer Vision (WACV) > 510 - 519

While semantic visual attributes have been shown to be useful for a variety of tasks, many attributes are difficult to model computationally. One of the reasons for this difficulty is that it is not clear where in an image the attribute lives. We propose to tackle this problem by involving humans more directly in the process of learning an attribute model. We ask humans to examine a set of images to determine...

chapter

First-Person Action Decomposition and Zero-Shot Learning

Yun C. Zhang, Yin Li, James M. Rehg

2017 IEEE Winter Conference on Applications of Computer Vision (WACV) > 121 - 129

In this work, we decompose a first-person action into verb and noun. We then study how the coupling of an action's constituent verb and noun affects the learners' ability to learn them separately and to combine them to perform recognition. We compare different information fusion methods on conventional action recognition and zero-shot learning, of which the latter is a strong indication of the feature's...

chapter

Deep-Crowd-Label: A deep-learning based crowd-assisted system for location labeling

Mohammad-Mahdi Moazzami, Jasvinder Singh, Vijay Srinivasan, Guoliang Xing

2017 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops) > 195 - 200

Semantic labels are crucial parts of many location-based applications. Previous efforts in location-based systems have mostly focused on achieving high accuracy in localization or navigation, with the assumption that the mapping between the locations and the semantic labels is given or will be done manually. In this paper, we propose a system called Deep-Crowd-Label that automatically assigns...

chapter

Transfer learning in long-text keystroke dynamics

Hayreddin Ceker, Shambhu Upadhyaya

2017 IEEE International Conference on Identity, Security and Behavior Analysis (ISBA) > 1 - 6

Conventional machine learning algorithms based on keystroke dynamics build a classifier from labeled data in one or more sessions but assume that the dataset at the time of verification exhibits the same distribution. Ideally, the keystroke data collected at a session is expected to be an invariant representation of an individual's behavioral biometrics. In real applications, however, the data is...

chapter

Convolutional Neural Network using a threshold predictor for multi-label speech act classification

Guanghao Xu, Hyunjung Lee, Myoung-Wan Koo, Jungyun Seo

2017 IEEE International Conference on Big Data and Smart Computing (BigComp) > 126 - 130

The spoken language understanding (SLU) pilot task of the Dialog State Tracking Challenge 5 (DSTC5) requires classifying label sets of speech acts in human-to-human dialogues. In this paper, we propose a multi-label classification model with the assistance of an algorithm adaptation method. Specifically, a Convolutional Neural Network (CNN) model on top of pre-trained word vectors...

chapter

Post-Silicon Validation: Automatic Characterization of RF Device Nonidealities via Iterative Learning Experiments on Hardware

Barry Muldrey, Sabyasachi Deyati, Abhijit Chatterjee

2017 30th International Conference on VLSI Design and 2017 16th International Conference on Embedded Systems (VLSID) > 403 - 408

Recent studies show that increasing numbers of design bugs are escaping to post-silicon due to the complexity of advanced designs and the lack of adequate verification tools that can validate complex electrical interactions between electrical subsystems on an integrated circuit. In this paper, we present a novel tool for post-silicon validation of mixed-signal/RF circuits through cooperative test...

chapter

From perceptrons to deep neural networks

Peter Lacko

2017 IEEE 15th International Symposium on Applied Machine Intelligence and Informatics (SAMI) > 169 - 172

Deep neural networks are an intensively researched field of artificial intelligence. Big companies such as Google, Microsoft, Baidu and Facebook support research and development in this field. The recent victory over a human player in the game of Go points to the huge potential of this approach. Machine learning approaches based on deep learning techniques bring significant gains over existing methods...

chapter

A full band adaptive Harmonic Model based Speaker Identity Transformation using Radial Basis Function

Ankita Chadha, Jagannath Nirmal

2017 11th International Conference on Intelligent Systems and Control (ISCO) > 217 - 223

Speaker Transformation adapts the speaker-dependent characteristics of the source speaker to those of a target speaker, so that the speech is perceived as the target speaker's. Speaker Transformation is generally carried out using a speech analysis-synthesis system. The full-band adaptive Harmonic Model (a-HM) based analysis-synthesis is able to produce high-quality resynthesized speech. Thus...

INFONA - science communication portal