Search results

chapter

Performance estimation of spontaneous speech recognition using non-reference acoustic features

Ling Guo, Takeshi Yamada, Shoji Makino

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 4

To ensure a satisfactory QoE (Quality of Experience), it is essential to establish a method that can efficiently assess recognition performance for spontaneous speech. Such a method makes it possible to monitor recognition performance while providing speech recognition services. It can also be used as a reliability measure in speech dialogue systems. Previously, methods for estimating...

chapter

An initial research: Towards accurate pitch extraction for speech synthesis based on BLSTM

Yibin Zheng, Zhengqi Wen, Bin Liu, Ya Li, et al.

2016 IEEE 13th International Conference on Signal Processing (ICSP) > 165 - 170

Accurate pitch extraction from speech is an important but challenging problem for speech synthesis. However, the additive nature and long-term suprasegmental property of pitch features have not been fully exploited in most existing pitch estimators, as they operate frame by frame. As a result, they can cause inherent discontinuities, such as double/half F0 errors and unvoiced/voiced...

chapter

Robust multiple sound source localization in noisy environment by using a soundfield microphone

Jundai Sun, Maoshen Jia, Changchun Bao

2016 IEEE 13th International Conference on Signal Processing (ICSP) > 545 - 550

Sound source localization techniques are becoming popular as they provide effective information for parameter coding and reconstruction of a sound scene. A recent approach based on “single-source” zone detection was proposed. However, the method is not robust in noisy environments due to its DOA estimation principle. To overcome this issue, a mixture enhancement processing based multiple sound source...

chapter

Adequacy analysis of autoregressive model for Lithuanian semivowels

G. Tamulevicius, J. Kaukenas

2016 IEEE 4th Workshop on Advances in Information, Electronic and Electrical Engineering (AIEEE) > 1 - 4

An autoregressive model order and parameter estimation technique is proposed and applied to the modeling of Lithuanian semivowels. According to experimental results, adequate modeling of semivowels requires a model order of 72 on average. The appropriate order value differs for female and male voices. Besides, there are remarkable differences between word-starting and middle phones - the latter are influenced...

chapter

Evaluation of noise estimation algorithms based on minimum statistics and signal to noise ratio

Niksa Jakovljevic, Dragisa Miskovic, Zeljen Trpovski

2016 24th Telecommunications Forum (TELFOR) > 1 - 4

The paper reports on the objective evaluation and comparison of two noise estimation algorithms for noisy speech signals. Both algorithms are based on the observation that local minima in a noisy speech spectrogram are close to the power level of the noise signal. The first algorithm directly searches the spectrogram for local minima and uses those values to update the noise power spectral density (PSD)...

chapter

VTS feature compensation based on two-layer GMM structure for robust speech recognition

Lin Zhou, Haijing Li, Ying Chen, Zhenyang Wu, et al.

2016 8th International Conference on Wireless Communications & Signal Processing (WCSP) > 1 - 5

In this paper, a two-layer Gaussian Mixture Model (GMM) structure for Vector Taylor Series (VTS) feature compensation is proposed for robust speech recognition. Since a GMM with numerous mixture components is used for VTS, the computational complexity of VTS is extremely high. To deal with this issue, we propose a two-layer GMM structure for VTS. In detail, the GMM with fewer mixture components is utilized...

chapter

Experimental study on noise pre-processing for a low bit rate speech coder

Wenhua Shi, Xiongwei Zhang, Xia Zou, Xiaodong Song

2016 8th International Conference on Wireless Communications & Signal Processing (WCSP) > 1 - 5

This paper focuses on the quality of speech coding parameter extraction under noisy and clean conditions. The influence of speech enhancement on the quality of extracted parameters for a low bit rate speech coder is addressed. The MELP vocoder is used to estimate three parameters: the fundamental frequency, voicing and linear prediction coefficients. De-noising methods in the MELPe vocoder and SMV are adopted...

chapter

Speech enhancement using combination of digital audio effects with Kalman filter

G. Manmadha Rao, Ummidala Santosh Kumar

2016 International Conference on Signal Processing, Communication, Power and Embedded System (SCOPES) > 1208 - 1211

The term “Quality of Speech” in speech enhancement techniques is associated with clarity and intelligibility. Because the nature and characteristics of noise vary with time and from process to process, speech enhancement remains a difficult problem in noisy environments. In this paper, we propose a method to improve the quality of speech based on a combination of Digital Audio Effects with Improved...

chapter

Single channel speech segregation using cepstrum method

Kukku Merin Skariah, M. S. Lekshmi

2016 International Conference on Emerging Technological Trends (ICETT) > 1 - 5

In natural environments, the speech signal is affected by various kinds of acoustic interference. Many applications in audio signal processing, such as automatic speech recognition, telecommunications and hearing aids, require an effective way of segregating the target speech from the mixed speech. Pitch information has an important role in the field of audio signal processing, especially...

chapter

Robust speaker identification under noisy conditions using feature compensation and signal to noise ratio estimation

Megan N. Frankle, Ravi P. Ramachandran

2016 IEEE 59th International Midwest Symposium on Circuits and Systems (MWSCAS) > 1 - 4

For wireless remote access security, forensics, electronic commerce and surveillance applications, there is a growing need for biometric speaker identification systems to be robust to noise. This paper examines the robustness issue for the case of additive white noise at signal to noise ratios ranging from 0 to 30 dB. A Gaussian mixture model classifier based on adaptation of a universal background...

chapter

F₀ estimation of speech based on IRAPT using WLP-based TV-CAR analysis

Wei Shan, Keiichi Funaki

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 4

Fundamental frequency (F₀) estimation plays an important role in speech processing tasks such as speech coding, synthesis and recognition. Although current F₀ estimation methods perform well under clean conditions, their performance deteriorates significantly in noisy environments. For this reason, F₀ estimation that is robust against additive noise is in demand. We have previously proposed F₀ estimation methods...

chapter

The effect of gain thresholds on speech intelligibility for statistical model based noise reduction for cochlear implants: A simulation based verification

Wenzhi He, Nengheng Zheng, Qinglin Meng

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 4

Noise corruption can dramatically decrease speech intelligibility for listeners with cochlear implants (CI). Noise reduction is a key component of CI speech processing strategies. This paper proposes a statistical model based noise reduction algorithm for CIs. A realistic noise estimator, which requires no prior knowledge of the noise, is adopted for noise estimation. An improved method for determining...

chapter

Speech enhancement based on nonparametric factor analysis

Lin Li, Jiawen Wu, Xinghao Ding, Qingyang Hong, et al.

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

A new speech enhancement strategy is proposed that utilizes a Bayesian nonparametric method, beta process factor analysis. Within a sparse representation framework, dictionary learning, sparse coefficient representation and noise variance estimation are integrated into a joint procedure of Bayesian posterior estimation. The beta process is adopted as a sparse prior to infer the sparsity of the...

chapter

Two methods for estimating noise amplitude spectral in non-stationary environments

Shifeng Ou, Wei Liu, Suojin Shen, Ying Gao

2016 9th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI) > 969 - 973

Estimating the amplitude spectrum of the noise signal is a very important part of many noise reduction systems. The conventional voice activity detection (VAD)-based method updates the amplitude spectrum estimate only in speech-absent regions and fails to deal with non-stationary noise. To overcome this problem, this paper proposes two methods to estimate the noise amplitude spectrum for non-stationary...

chapter

Analysis of the dependencies between parameters of the voice at the context of the succession of sung vowels

Edward Polrolniczak, Michal Kramarczyk

2016 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA) > 72 - 77

The article presents the results of signal analysis of recorded singing voice samples. For this study, recorded samples of the “a-e-i-o-u” exercise are analysed. Some significant parameters describing the voice have been estimated. Among the estimated parameters are: pitch, calculated using the autocorrelation method, values of the first five harmonics, a set of parameters containing the first five...

chapter

A contingency multi-microphone noise reduction strategy based on linearly constrained Multi-channel Wiener filtering

Randall Ali, Marc Moonen

2016 IEEE International Workshop on Acoustic Signal Enhancement (IWAENC) > 1 - 4

The Minimum Variance Distortionless Response (MVDR) beamformer is a popular multi-microphone noise reduction and speech enhancement strategy that can be implemented either as a fixed-constraint MVDR beamformer, with a pre-defined Relative Transfer Function (RTF), or based on a Multi-channel Wiener Filter (MWF) estimate. However, neither implementation is fully robust within a dynamic acoustic environment...

chapter

Dual-microphone phase-difference-based SNR estimation with applications to speech enhancement

Frederic Mustiere, Renato Nakagawa, Kamil Wojcicki, Ivo Merks, et al.

2016 IEEE International Workshop on Acoustic Signal Enhancement (IWAENC) > 1 - 5

This paper introduces novel two-channel a priori Signal-to-Noise Ratio (SNR) estimators for use in frequency-domain speech enhancement algorithms. The SNR estimation is based on statistics of the noisy phase difference between two microphones in each frequency bin. Namely, the corresponding probability distribution is derived assuming a complex Gaussian model, and is written in terms of the SNR only...

chapter

Modeling audio directional statistics using a probabilistic spatial dictionary for speaker diarization in real meetings

Mahmoud Fakhry, Nobutaka Ito, Shoko Araki, Tomohiro Nakatani

2016 IEEE International Workshop on Acoustic Signal Enhancement (IWAENC) > 1 - 5

Speaker diarization is the task of estimating “who spoke when” in a meeting. To realize accurate diarization for real meetings, we have to deal with noise, speaker overlap, reverberation, etc. In this work, we propose to model directional statistics of spatial clusters via a dictionary of probabilistic models. The dictionary is trained using spatial features of possible source locations. Observed...

chapter

Artificial bandwidth extension using deep neural networks for spectral envelope estimation

Johannes Abel, Maximilian Strake, Tim Fingscheidt

2016 IEEE International Workshop on Acoustic Signal Enhancement (IWAENC) > 1 - 5

Many artificial speech bandwidth extension (ABE) approaches perform source-filter decomposition of the input narrowband speech, with subsequent computation of upper frequency band (UB) spectral envelope posteriors. In this paper we perform a direct comparison of HMM- and deep neural network (DNN)-based modeling of likelihoods or posteriors for ABE UB envelope estimation. DNN-based approaches turn...

chapter

A real-time noise energy estimation method

Wei Yaodu, Liu Li, Wang Lizhong

2016 IEEE International Workshop on Acoustic Signal Enhancement (IWAENC) > 1 - 4

Noise energy estimation is widely used as a pre-processing step in speech enhancement and speech recognition systems. While many signal processing algorithms have been proposed to estimate the additive noise energy, they are generally based on a statistical hypothesis and have high computational complexity, which is a critical concern on mobile devices. When the hypothesis does not hold, the estimation performance...

INFONA - science communication portal