Search results for: K Prahallad

Items from 1 to 11 out of 11 results

chapter

Fundamental frequency generation for whisper-to-audible speech conversion

M. Janke, M. Wand, T. Heistermann, T. Schultz, more

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 2579 - 2583

ICASSP 2014 - 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

In this work, we address the issues involved in whisper-to-audible speech conversion. Spectral mapping techniques using Gaussian mixture models or Artificial Neural Networks borrowed from voice conversion have been applied to transform whisper spectral features to normally phonated audible speech. However, the modeling and generation of fundamental frequency (F0) and its contour in the converted speech...

article

Segmentation of Monologues in Audio Books for Building Synthetic Voices

K Prahallad, A W Black

IEEE Transactions on Audio, Speech, and Language Processing > 2011 > 19 > 5 > 1444 - 1449

One of the issues in using audio books for building a synthetic voice is the segmentation of large speech files. The use of the Viterbi algorithm to obtain phone boundaries on large audio files fails primarily because of huge memory requirements. Earlier works have attempted to resolve this problem by using a large vocabulary speech recognition system employing a restricted dictionary and language model...

chapter

Significance of anchor speaker segments for constructing extractive audio summaries of broadcast news

Sree Harsha Yella, V Varma, K Prahallad

2010 IEEE Spoken Language Technology Workshop > 13 - 18

2010 IEEE Spoken Language Technology Workshop (SLT 2010)

Analysis of human reference summaries of broadcast news showed that humans give preference to anchor speaker segments while constructing a summary. Therefore, we exploit the role of anchor speaker in a news show by tracking his/her speech to construct indicative/informative extractive audio summaries. Speaker tracking is done by Bayesian information criterion (BIC) technique. The proposed technique...

chapter

A multilingual screen reader in Indian languages

E.V. Raghavendra, K. Prahallad

2010 National Conference On Communications (NCC) > 1 - 5

2010 National Conference on Communications (NCC 2010)

A screen reader is a form of assistive technology that helps visually impaired people use or access the computer and the Internet. So far, it has remained expensive and within the domain of English (and some foreign) language computing. For Indian languages this development is limited by: availability of Text-to-Speech (TTS) systems in Indian languages, support for reading glyph based font encoded text,...

chapter

Speech synthesis using artificial neural networks

E.V. Raghavendra, P. Vijayaditya, K. Prahallad

2010 National Conference On Communications (NCC) > 1 - 5

2010 National Conference on Communications (NCC 2010)

Statistical parametric synthesis has become more popular in recent years due to its adaptability and size of the synthesis. Mel cepstral coefficients, fundamental frequency (f₀) and duration are the main components for synthesizing speech in statistical parametric synthesis. The current study mainly concentrates on mel cepstral coefficients. Durations and f₀ are taken from the original data. In this...

chapter

Voice conversion using Artificial Neural Networks

S. Desai, E.V. Raghavendra, B. Yegnanarayana, A.W. Black, more

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 3893 - 3896

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

In this paper, we propose to use artificial neural networks (ANN) for voice conversion. We have exploited the mapping abilities of ANN to perform mapping of spectral features of a source speaker to that of a target speaker. A comparative study of voice conversion using ANN and the state-of-the-art Gaussian mixture model (GMM) is conducted. The results of voice conversion evaluated using subjective...

chapter

Speech synthesis using approximate matching of syllables

E.V. Raghavendra, B. Yegnanarayana, K. Prahallad

2008 IEEE Spoken Language Technology Workshop > 37 - 40

2008 IEEE Workshop on Spoken Language Technology. SLT 2008

In this paper we propose a technique for a syllable based speech synthesis system. While syllable based synthesizers produce better sounding speech than diphone and phone, the coverage of all syllables is a non-trivial issue. We address the issue of coverage of syllables through approximating the syllable when the required syllable is not found. To verify our hypothesis, we conducted perceptual studies...

chapter

Global syllable set for building speech synthesis in Indian languages

E.V. Raghavendra, S. Desai, B. Yegnanarayana, A.W. Black, more

2008 IEEE Spoken Language Technology Workshop > 49 - 52

2008 IEEE Workshop on Spoken Language Technology. SLT 2008

Indian languages are syllabic in nature, and many syllables are common across them. This motivates us to build a global syllable set by combining multiple language syllables to build a synthesizer which can borrow units from a different language when the required syllable is not found. Such a synthesizer makes use of speech databases in different languages spoken by different speakers,...

chapter

AANN-HMM models for speaker verification and speech recognition

S. Joshi, K. Prahallad, K. Prahallad, B. Yegnanarayana

2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence) > 2681 - 2688

2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence)

Pattern classification is an important task in speech recognition and speaker verification. Given the feature vectors of an input the goal is to capture the characteristics of these features unique to each class. This paper deals with exploring Auto Associative Neural Network (AANN) models for the task of speaker verification and speech recognition. We show that AANN models produce comparable performance...

chapter

Significance of early tagged contextual graphemes in grapheme based speech synthesis and recognition systems

G.K. Anumanchipalli, K. Prahallad, A.W. Black

2008 IEEE International Conference on Acoustics, Speech and Signal Processing > 4645 - 4648

ICASSP 2008 - 2008 IEEE International Conference on Acoustics, Speech and Signal Processing

In this paper we present our argument that context information could be used in early stages, i.e., during the definition of the mapping of words into sequences of graphemes. We show that the early tagged contextual graphemes play a significant role in improving the performance of grapheme based speech synthesis and speech recognition systems.

chapter

Sub-Phonetic Modeling For Capturing Pronunciation Variations For Conversational Speech Synthesis

K. Prahallad, A.W. Black, R. Mosur

2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings > 1 > I

2006 IEEE International Conference on Acoustics, Speech, and Signal Processing

In this paper we address the issue of pronunciation modeling for conversational speech synthesis. We experiment with two different HMM topologies (fully connected state model and forward connected state model) for sub-phonetic modeling to capture the deletion and insertion of sub-phonetic states during speech production process. We show that the experimented HMM topologies have higher log likelihood...

INFONA - science communication portal
