In this paper, two models, the I-vector and the Gaussian Mixture Model-Universal Background Model (GMM-UBM), are compared for the speaker identification task. Four feature combinations of I-vectors with seven fusion techniques are considered: maximum, mean, weighted sum, cumulative, interleaving, and concatenation of both two and four features. In addition, an Extreme Learning Machine (ELM) is exploited...
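The element-wise fusion rules named in the abstract can be sketched on toy vectors; the feature values below are invented for illustration, and the "cumulative" rule is omitted since the abstract does not define it:

```python
# Element-wise fusion of two feature vectors (toy sketch with made-up values).

def fuse_max(a, b):
    """Keep the larger value in each dimension."""
    return [max(x, y) for x, y in zip(a, b)]

def fuse_mean(a, b):
    """Average the two vectors dimension by dimension."""
    return [(x + y) / 2 for x, y in zip(a, b)]

def fuse_weighted(a, b, w=0.7):
    """Weighted sum, with weight w on the first vector."""
    return [w * x + (1 - w) * y for x, y in zip(a, b)]

def fuse_interleave(a, b):
    """Alternate dimensions from each vector, doubling the length."""
    out = []
    for x, y in zip(a, b):
        out.extend([x, y])
    return out

def fuse_concat(a, b):
    """Stack the two vectors end to end."""
    return list(a) + list(b)

ivec_a = [0.9, 0.2, 0.4]  # hypothetical i-vector from one feature stream
ivec_b = [0.7, 0.6, 0.1]  # hypothetical i-vector from another stream
fused = fuse_concat(ivec_a, ivec_b)  # 6-dimensional combined representation
```

Note that max, mean, and weighted sum preserve the dimensionality, while interleaving and concatenation double it, which changes the input size of any downstream classifier such as the ELM.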
Point Process Models (PPM) have been widely used for keyword spotting applications. Training these models typically requires a considerable number of keyword examples. In this work, we consider a scenario where very few keyword examples are available for training. The availability of a limited number of training examples results in a PPM with poorly learnt parameters. We propose an unsupervised online...
The vulnerability of automatic speaker verification (ASV) systems against spoofing attacks is an important security concern about the reliability of ASV technology. Recently, various countermeasures have been developed for spoofing detection. In this paper, we propose to use features derived from linear prediction (LP) residual signal for spoofing detection using simple Gaussian mixture model (GMM)...
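A GMM back-end for spoofing detection can be sketched as a log-likelihood ratio between two mixtures, one representing genuine speech and one representing spoofed speech; the 1-D mixture parameters below are invented for illustration, not taken from the paper:

```python
import math

def gmm_loglik(x, weights, means, variances):
    """Log-likelihood of a scalar feature x under a 1-D Gaussian mixture."""
    total = 0.0
    for w, m, v in zip(weights, means, variances):
        total += w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
    return math.log(total)

# Hypothetical mixtures fit to genuine and spoofed feature distributions.
genuine = ([0.5, 0.5], [0.0, 1.0], [1.0, 1.0])
spoofed = ([1.0], [3.0], [1.0])

def llr(x):
    """Positive score -> classified genuine, negative -> spoofed."""
    return gmm_loglik(x, *genuine) - gmm_loglik(x, *spoofed)
```

In practice the features would be frame-level vectors derived from the LP residual, the mixtures would be trained on labeled data, and the per-frame log-likelihoods would be averaged over the utterance before thresholding.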
In this paper, the development of a Multilingual Phone Recognition System (MPRS) in the context of Indian languages is described. MPRS is a language-independent Phone Recognition System (PRS) that can recognise the phonetic units present in a speech utterance of any language. We have developed two bilingual PRSs and a quadrilingual PRS using four Indian languages: Kannada, Telugu, Bengali, and Odia. International...
With the rapid development of the Internet, how to obtain valuable information from massive volumes of messages has become a major problem that needs to be solved in the era of information explosion. This paper traces the development of information extraction technology and discusses four categories of Chinese entity relation extraction technologies in depth. Finally, the advantages and disadvantages of different...
Deep neural networks (DNNs) have recently been shown to give state-of-the-art performance in monaural speech enhancement. However, in the DNN training process, the perceptual difference between different components of the DNN output is not fully exploited, as equal importance is often assumed. To address this limitation, we propose a new perceptually-weighted objective function within a feedforward...
Background noise reduction has been studied for many years. However, suppression of unwanted human speech noise is not well discussed, due to the sparsity of the speech signal. Traditional blind source separation (BSS) methods such as independent component analysis (ICA) assume prior knowledge of the number of sources and require that the number of sources equal the number of sensors. These limitations...
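The determined-mixing assumption behind ICA-style BSS can be illustrated with a 2-sensor, 2-source instantaneous model x = A·s: the model is only invertible when the mixing matrix A is square, which is why the number of sensors must match the number of sources. ICA would estimate the demixing matrix blindly; the sketch below simply applies the known inverse to show why squareness matters (all values are invented):

```python
# Determined (2 sources, 2 sensors) instantaneous mixing: x = A s.

def mix(A, s1, s2):
    """Apply a 2x2 matrix A to two sample sequences."""
    return ([A[0][0] * a + A[0][1] * b for a, b in zip(s1, s2)],
            [A[1][0] * a + A[1][1] * b for a, b in zip(s1, s2)])

def demix(A, x1, x2):
    """Recover the sources by applying the inverse of the mixing matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    inv = [[A[1][1] / det, -A[0][1] / det],
           [-A[1][0] / det, A[0][0] / det]]
    return mix(inv, x1, x2)

s1, s2 = [1.0, 0.0, 2.0], [0.0, 1.0, 1.0]   # toy source signals
A = [[0.8, 0.3], [0.2, 0.9]]                 # toy mixing matrix
x1, x2 = mix(A, s1, s2)                      # sensor observations
r1, r2 = demix(A, x1, x2)                    # recovered sources
```

With fewer sensors than sources the system x = A·s is underdetermined and no such inverse exists, which is exactly the limitation the abstract points at.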
The article presents studies on automatic whispery speech recognition. In the research, a new corpus of whispery speech was used. The aim of the studies presented in this paper was to check how vocabulary size and language model order influence speech recognition quality. It has been concluded that even using recordings with only 5,000 different words it is possible...
A large amount of parallel training corpus is necessary for robust, high-quality voice conversion. However, such parallel data may not always be available. This letter presents a new voice conversion method that needs no parallel speech corpus, and adopts a restricted Boltzmann machine (RBM) to represent the distribution of the spectral features derived from a target speaker. A linear transformation...
Speaker recognition has been developed over many years and comes with many different methods. MFCC is one of the more successful methods, as it is modeled on the human auditory system. It achieves a high recognition rate and strong robustness against noise in the lower frequency regions. However, in the higher frequency regions, it captures speaker characteristic information...
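The behavior the abstract describes follows from the mel scale used in MFCC extraction: equal steps in mel correspond to narrow Hz bands at low frequencies and ever-wider bands at high frequencies, so high-frequency detail is smoothed away. A sketch using the standard HTK-style mel formula:

```python
import math

def hz_to_mel(f):
    """HTK-style mel scale: fine resolution at low f, coarse at high f."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse mapping back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

# Ten equal-width mel bands over 0-8000 Hz...
lo, hi = hz_to_mel(0.0), hz_to_mel(8000.0)
edges_mel = [lo + i * (hi - lo) / 10 for i in range(11)]
# ...map to Hz bands that grow steadily wider toward high frequencies.
edges_hz = [mel_to_hz(m) for m in edges_mel]
```

This is why the mel filterbank averages over broad high-frequency regions, losing some of the speaker-specific detail found there.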
In this paper, we present a novel spectrum mapping method — Continuous Frequency Warping and Magnitude Scaling (CFWMS) for voice conversion under the Joint Density Gaussian Mixture Model (JDGMM) framework. JDGMM is a mature clustering technique that models the joint probability density of speech signals from paired speakers. The conventional JDGMM-based approaches morph the spectral features via least...
In the paper we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpus of annotated Polish speech data. We propose an MPI-based modification of the training...
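The core of parameter-averaged parallel training is simple: each worker trains a model replica on its own data shard, and periodically the replicas' parameters are averaged and redistributed (with MPI this is typically an allreduce-sum divided by the number of workers). A toy sketch with parameters as flat lists of floats:

```python
# Periodic parameter averaging across worker replicas (toy sketch).

def average_parameters(replicas):
    """Average corresponding parameters across all worker replicas."""
    n = len(replicas)
    return [sum(w[i] for w in replicas) / n for i in range(len(replicas[0]))]

# Hypothetical parameter vectors from three workers after a training phase.
workers = [[1.0, 4.0],
           [3.0, 0.0],
           [2.0, 2.0]]
avg = average_parameters(workers)  # [2.0, 2.0], broadcast back to all workers
```

In the real setup each "parameter vector" is the full set of network weight matrices, and the averaging interval trades communication cost against how far the replicas are allowed to drift apart.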
The sparsity of a speech signal plays an important role in speech processing. This paper proposes a method in which a variable sparsity regularization factor is applied to the mixed signal and used to separate monaural speech signals. The sparsity regularization factor for each training and testing signal was found using particle swarm optimization. The algorithm has been tested for...
A much-discussed topic in recent years is the interconnectedness of industrial plants in the field of Cyber-Physical Production Systems (CPPS). In the future, data and aggregated information from various production plants will be available globally at any time. Particularly in maintenance, this could be a helpful expansion of available information for the maintenance staff, since maintenance information...
This paper describes the implementation of an HMM (Hidden Markov Model) based speaker-independent isolated-word Automatic Speech Recognition (ASR) system for Nepali, a language commonly spoken in Nepal. The system has been developed in Python using the numpy [1] and YAHMM [2] libraries. The system is trained on different Nepali words using data collected from different speakers in a room environment....
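In HMM-based isolated-word recognition, each vocabulary word gets its own HMM; an utterance is scored against every word model and the best-scoring word wins. A minimal log-domain Viterbi sketch with hypothetical 2-state, 2-symbol discrete models (the actual system would use continuous acoustic features):

```python
import math

def viterbi_score(obs, log_start, log_trans, log_emit):
    """Log-probability of the best state path through one word HMM."""
    n = len(log_start)
    prob = [log_start[s] + log_emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        prob = [max(prob[p] + log_trans[p][s] for p in range(n))
                + log_emit[s][o] for s in range(n)]
    return max(prob)

ln = math.log
start = [ln(0.5), ln(0.5)]
trans = [[ln(0.5), ln(0.5)], [ln(0.5), ln(0.5)]]
# Word model A mostly emits symbol 0; word model B mostly emits symbol 1.
emit_a = [[ln(0.8), ln(0.2)], [ln(0.8), ln(0.2)]]
emit_b = [[ln(0.2), ln(0.8)], [ln(0.2), ln(0.8)]]

utterance = [0, 0, 1, 0]  # quantized features, mostly symbol 0
score_a = viterbi_score(utterance, start, trans, emit_a)
score_b = viterbi_score(utterance, start, trans, emit_b)
# Recognition picks the word whose HMM scores highest: here, word A.
```

Training estimates each word model's transition and emission parameters from the collected recordings (e.g. via Baum-Welch); decoding then reduces to this per-word scoring loop.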
The performance of several training techniques for automatic speech recognition systems is compared in this paper. Speech recognition accuracy was used as the measure of performance. Different kinds of outdoor and indoor noise were used in the study. The superiority of methods that train on noisy speech over the competing technique of training on clean speech is shown. It has been found that training by...
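Training data for noisy-speech (multi-condition) training is typically synthesized by mixing clean speech with noise scaled to a target signal-to-noise ratio. A toy sketch with invented signals:

```python
import math

def add_noise(signal, noise, snr_db):
    """Mix noise into signal at a chosen SNR in dB."""
    p_sig = sum(x * x for x in signal) / len(signal)    # signal power
    p_noise = sum(x * x for x in noise) / len(noise)    # noise power
    # Scale so that p_sig / (scale^2 * p_noise) == 10**(snr_db / 10).
    scale = math.sqrt(p_sig / (p_noise * 10 ** (snr_db / 10)))
    return [s + scale * n for s, n in zip(signal, noise)]

clean = [math.sin(0.1 * t) for t in range(1000)]                # toy "speech"
noise = [((t * 37) % 19 - 9) / 9.0 for t in range(1000)]        # toy noise
noisy = add_noise(clean, noise, snr_db=10.0)
```

Repeating this with the different outdoor and indoor noise types, at several SNRs, yields the noised training sets whose advantage over clean-speech training the paper reports.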
Deep learning has brought a breakthrough in the performance of speech recognition. Speech recognition systems based on deep neural networks have obtained state-of-the-art performance on various speech recognition tasks. These systems almost universally utilize Mel-frequency cepstral coefficients or Mel-scale log-filterbank coefficients, which are based on the short-time Fourier transform. Although...
The authors are developing a talking robot, a mechanical vocalization system that models the human articulatory system. The talking robot is constructed with mechanical parts made by referring to human vocal organs biologically and functionally. In this study, a newly redesigned artificial vocal cord is developed to extend the speaking capability of the talking robot...
This paper describes a novel algorithm to improve the performance of the sparsity-based single-channel speech separation (SCSS) problem, based on compressed sensing, an emerging technique for efficient data reconstruction. The conventional approach assumes that the mixing conditions and source signals are stationary. For practical applications of audio source separation, however, we face the challenges...
We present a method for estimating the body orientation of seated people in a smart room by fusing low-resolution range information collected from downward pointed time-of-flight (ToF) sensors with synchronized speaker identification information from microphone recordings. The ToF sensors preserve the privacy of the occupants in that they only return the range to a small set of hit points. We propose...