Search results for: Kanishka Rao

Items 1 to 9 of 9 results

article

The diagnostic value of cytology in parotid Warthin's tumors: International multicenter series

Daniele Borsetto, Jonathan M. Fussey, Diego Cazzador, Joel Smith, et al.

Head & Neck > 42 > 8 > 2215 - 2216

article

The diagnostic value of cytology in parotid Warthin's tumors: international multicenter series

Daniele Borsetto, Jonathan M. Fussey, Diego Cazzador, Joel Smith, et al.

Head & Neck > 42 > 3 > 522 - 529

Introduction Warthin's tumor (WT) is a common benign salivary gland neoplasm with a negligible risk of malignant transformation. However, there is a risk of malignant tumors being misdiagnosed as WT on cytology and inappropriately managed conservatively. Methods Patients from nine centers in Italy and the United Kingdom undergoing parotid surgery for cytologically diagnosed WT were included in...

chapter

Multi-accent speech recognition with hierarchical grapheme based models

Kanishka Rao, Hasim Sak

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4815 - 4819

We train grapheme-based acoustic models for speech recognition using a hierarchical recurrent neural network architecture with connectionist temporal classification (CTC) loss. The models learn to align utterances with phonetic transcriptions in a lower layer and graphemic transcriptions in the final layer in a multi-task learning setting. Using the grapheme predictions from a hierarchical model trained...

chapter

Flat start training of CD-CTC-SMBR LSTM RNN acoustic models

Kanishka Rao, Andrew Senior, Hasim Sak

2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5405 - 5409

We present a recipe for training acoustic models with context dependent (CD) phones from scratch using recurrent neural networks (RNNs). First, we use the connectionist temporal classification (CTC) technique to train a model with context independent (CI) phones directly from the written-domain word transcripts by aligning with all possible phonetic verbalizations. Then, we devise a mechanism to generate...

chapter

Personalized speech recognition on mobile devices

Ian McGraw, Rohit Prabhavalkar, Raziel Alvarez, Montse Gonzalez Arenas, et al.

2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5955 - 5959

We describe a large vocabulary speech recognition system that is accurate, has low latency, and yet has a small enough memory and computational footprint to run faster than real-time on a Nexus 5 Android smartphone. We employ a quantized Long Short-Term Memory (LSTM) acoustic model trained with connectionist temporal classification (CTC) to directly predict phoneme targets, and further reduce its...

chapter

Acoustic modelling with CD-CTC-SMBR LSTM RNNs

Andrew Senior, Hasim Sak, Felix de Chaumont Quitry, Tara Sainath, et al.

2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) > 604 - 609

This paper describes a series of experiments to extend the application of Context-Dependent (CD) long short-term memory (LSTM) recurrent neural networks (RNNs) trained with Connectionist Temporal Classification (CTC) and sMBR loss. Our experiments, on a noisy, reverberant voice search task, include training with alternative pronunciations and the application to child speech recognition; combination...

chapter

Grapheme-to-phoneme conversion using Long Short-Term Memory recurrent neural networks

Kanishka Rao, Fuchun Peng, Hasim Sak, Francoise Beaufays

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4225 - 4229

Grapheme-to-phoneme (G2P) models are key components in speech recognition and text-to-speech systems as they describe how words are pronounced. We propose a G2P model based on a Long Short-Term Memory (LSTM) recurrent neural network (RNN). In contrast to traditional joint-sequence based G2P approaches, LSTMs have the flexibility of taking into consideration the full context of graphemes and transform...

chapter

Automatic pronunciation verification for speech recognition

Kanishka Rao, Fuchun Peng, Francoise Beaufays

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5162 - 5166

Pronunciations for words are a critical component in an automated speech recognition system (ASR) as mis-recognitions may be caused by missing or inaccurate pronunciations. The need for high quality pronunciations has recently motivated data-driven techniques to generate them [1]. We propose a data-driven and language-independent framework for verification of such pronunciations to further improve...

chapter

Learning acoustic frame labeling for speech recognition with recurrent neural networks

Hasim Sak, Andrew Senior, Kanishka Rao, Ozan Irsoy, et al.

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4280 - 4284

We explore alternative acoustic modeling techniques for large vocabulary speech recognition using Long Short-Term Memory recurrent neural networks. For an acoustic frame labeling task, we compare the conventional approach of cross-entropy (CE) training using fixed forced-alignments of frames and labels, with the Connectionist Temporal Classification (CTC) method proposed for labeling unsegmented sequence...
