Search results

chapter

Speaker extraction using LCMV beamformer with DNN-based SPP and RTF identification scheme

Ariel Malek, Shlomo E. Chazan, Ilan Malka, Vladimir Tourbabin, et al.

2017 25th European Signal Processing Conference (EUSIPCO) > 2274 - 2278

The linearly constrained minimum variance (LCMV)-beamformer (BF) is a viable solution for desired source extraction from a mixture of speakers in a noisy environment. The performance in terms of speech distortion, interference cancellation and noise reduction depends on the estimation of a set of parameters. This paper presents a new mechanism to update the parameters of the LCMV-BF. A new speech...

chapter

Data-driven and physical model-based designs of probabilistic spatial dictionary for online meeting diarization and adaptive beamforming

Nobutaka Ito, Shoko Araki, Tomohiro Nakatani

2017 25th European Signal Processing Conference (EUSIPCO) > 1165 - 1169

In this paper, we comparatively study alternative dictionary designs for recently proposed meeting diarization and adaptive beamforming based on a probabilistic spatial dictionary. This dictionary models the feature distribution for each possible direction of arrival (DOA) of speech signals and the feature distribution for background noise. The dictionary enables online DOA detection, which in turn...

chapter

Successive relative transfer function identification using single microphone speech enhancement

Dani Cherkassky, Shlomo E. Chazan, Jacob Goldberger, Sharon Gannot

2017 25th European Signal Processing Conference (EUSIPCO) > 1235 - 1239

A distortionless speech extraction in a reverberant environment can be achieved by an application of a beamforming algorithm, provided that the relative transfer functions (RTFs) of the sources and the covariance matrix of the noise are known. In this contribution, we consider the RTF identification challenge in a multi-source scenario. We propose a successive RTF identification (SRI), based on a...

chapter

Multiple DOA estimation based on estimation consistency and spherical harmonic multiple signal classification

Sina Hafezi, Alastair H. Moore, Patrick A. Naylor

2017 25th European Signal Processing Conference (EUSIPCO) > 1240 - 1244

A common approach to multiple Direction-of-Arrival (DOA) estimation of speech sources is to identify Time-Frequency (TF) bins with dominant Single Source (SS) and apply DOA estimation such as Multiple Signal Classification (MUSIC) only on those TF bins. In the state-of-the-art Direct Path Dominance (DPD)-MUSIC, the covariance matrix, used as the input to MUSIC, is calculated using only the TF bins...

chapter

Experimental analysis of optimal window length for independent low-rank matrix analysis

Daichi Kitamura, Nobutaka Ono, Hiroshi Saruwatari

2017 25th European Signal Processing Conference (EUSIPCO) > 1170 - 1174

In this paper, we address the blind source separation (BSS) problem and analyze the optimal window length in the short-time Fourier transform (STFT) for independent low-rank matrix analysis (ILRMA). ILRMA is a state-of-the-art BSS technique that utilizes the statistical independence between low-rank matrix spectrogram models, which are estimated by nonnegative matrix factorization. In conventional...

chapter

A variational EM method for pole-zero modeling of speech with mixed block sparse and Gaussian excitation

Liming Shi, Jesper Kjar Nielsen, Jesper Rindom Jensen, Mads Grosboll Christensen

2017 25th European Signal Processing Conference (EUSIPCO) > 1784 - 1788

The modeling of speech can be used for speech synthesis and speech recognition. We present a speech analysis method based on pole-zero modeling of speech with mixed block sparse and Gaussian excitation. By using a pole-zero model, instead of the all-pole model, a better spectral fitting can be expected. Moreover, motivated by the block sparse glottal flow excitation during voiced speech and the white...

chapter

Multi-way regression for age prediction exploiting speech and face image information

Evangelia Pantraki, Constantine Kotropoulos

2017 25th European Signal Processing Conference (EUSIPCO) > 2196 - 2200

In this paper, the problem of age estimation is addressed based on two modalities: speech utterances and speakers' face images. The proposed age estimation framework employs the Shifted Covariates REgression Analysis for Multi-way data (SCREAM) model, which combines Parallel Factor Analysis 2 and Principal Covariates Regression. SCREAM is able to extract a few latent variables from multi-way data...

chapter

A novel filterbank for epoch estimation

Pramod Bachhav, Hemant A. Patil

2017 25th European Signal Processing Conference (EUSIPCO) > 1624 - 1628

We present a novel approach for epoch estimation from the simple observation of the speech spectrum. Fundamental frequency (F₀) of the speech signal and local variations around F₀ are the characteristics of glottal excitation source. Extraction of this information from the speech spectrum can be used to estimate epochs (since higher harmonics interact with the vocal tract characteristics, they no...

chapter

Modeling formant dynamics in speech spectral envelopes

Alexandra Craciun, Jouni Paulus, Gokhan Sevkin, Tom Backstrom

2017 25th European Signal Processing Conference (EUSIPCO) > 1619 - 1623

The spectral envelope of a speech signal encodes information about the characteristics of the speech source. As a result, spectral envelope modeling is a central task in speech applications, where tracking temporal transitions in diphones and triphones is essential for efficient speech synthesis and recognition algorithms. Temporal changes in the envelope structure are often derived from estimated...

chapter

Disordered speech quality estimation using linear prediction

Yousef S Ettomi Ali, Vijay Parsa, Philip Doyle, Soulaimane Berkane

2017 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM) > 1 - 5

Tracheoesophageal (TE) speech is generated by patients who have undergone a total laryngectomy where the larynx (voice box) is removed and replaced by a tracheoesophageal puncture. This work presents a novel low complexity algorithm to estimate the degree of severity of disordered TE speech. The proposed algorithm uses features which are computed from 32-ms voiced frames of the speech signal. A 21-st...

chapter

Speech intelligibility and quality: A comparative study of speech enhancement algorithms

Michael Russell, Ronan Flynn, Xiaodong Xu

2017 28th Irish Signals and Systems Conference (ISSC) > 1 - 6

Mobile devices are widely used today for speech communication. The environments in which these devices are used are widely varied and often the level of background noise in the speaker's environment can be significant. The purpose of speech enhancement is to reduce the level of background noise, ideally to such a level that it is not noticed by the listener. While speech enhancement algorithms can...

chapter

Privacy-Preserving Understanding of Human Body Orientation for Smart Meetings

Indrani Bhattacharya, Noam Eshed, Richard J. Radke

2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) > 284 - 292

We present a method for estimating the body orientation of seated people in a smart room by fusing low-resolution range information collected from downward pointed time-of-flight (ToF) sensors with synchronized speaker identification information from microphone recordings. The ToF sensors preserve the privacy of the occupants in that they only return the range to a small set of hit points. We propose...

chapter

Phylogeny analysis for MP3 and AAC coding transformations

Milica Maksimovic, Luca Cuccovillo, Patrick Aichroth

2017 IEEE International Conference on Multimedia and Expo (ICME) > 1165 - 1170

The following paper presents our work on audio phylogeny with a focus on two application scenarios: audiovisual (A/V) archives and tampering detection. Starting from a set of near-duplicate audio files, our goal is to determine the processing history for the set, and detect the transformations that have been applied on each linked pair of nodes. Our approach targets AAC and MP3 encoding operations...

chapter

Articulation acoustic kinematics in ALS speech

P. P. Gomez, D. Palacios, A. Gomez, V. Rodellar, et al.

2017 International Conference and Workshop on Bioinspired Intelligence (IWOBI) > 1 - 6

Patients affected by Amyotrophic Lateral Sclerosis (ALS) show specific dysarthric clues in speech. These marks could be used to detect early symptoms and monitor the evolution of the disease in time. Classically articulation marks have been mainly based on static premises. Articulation Kinematics from acoustic correlates may help in producing measurements based on the dynamic behavior of speech. Specifically,...

chapter

Modernization of formant method of estimation of voice information protection from leakage through technical channels

V. A. Trushin, I. L. Reva, A. V. Ivanov

2017 International Siberian Conference on Control and Communications (SIBCON) > 1 - 5

This study summarizes research results on the analysis and modification of the existing approach to estimating voice information security, in particular by changing the immediate appreciation test conditions, considering the speech forcing effect, adjusting the frequency range width and the method of its division, analyzing the amplitude constitution of speech, and qualifying the test signal level. Another point...

chapter

Robust speaker verification with a two classifier format and feature enhancement

Joshua S. Edwards, Ravi P. Ramachandran, Umashanger Thayasivam

2017 IEEE International Symposium on Circuits and Systems (ISCAS) > 1 - 4

In the presence of environmental noise, speaker verification systems inevitably see a decrease in performance. This paper proposes the (1) use of two parallel classifiers, (2) feature enhancement based on blind signal-to-noise ratio (SNR) estimation and (3) fusion, to improve the performance of speaker verification systems. The two classifiers are based on Gaussian mixture models and the partial least-squares...

chapter

Late reverberation reduction and blind reverberation time measurement for automatic speech recognition

Arkadiy Prodeus

2017 IEEE First Ukraine Conference on Electrical and Computer Engineering (UKRCON) > 634 - 639

Development of automatic speech recognition (ASR) systems that are robust to late reverberation is an urgent task. It is well known that a late reverberation reduction algorithm used as an ASR pre-processor demands prior estimation of the reverberation time. Blind reverberation time measurements are less accurate than direct measurements from a known room impulse response (RIR). As a result, it is natural to expect...

chapter

Modified Wiener filtering speech enhancement algorithm with phase spectrum compensation

Zhang Wenlu, Peng Hua

2017 IEEE 9th International Conference on Communication Software and Networks (ICCSN) > 1075 - 1079

In this paper, a modified Wiener filtering speech enhancement algorithm with phase spectrum compensation is proposed, which aims to improve the performance of the typical Wiener filtering speech enhancement algorithm at low signal-to-noise ratios. Since typical speech enhancement algorithms have always used the observed noisy speech phase spectrum unchanged as the enhanced speech phase spectrum, and estimated...

chapter

Auditory mask estimation by RPCA for monaural speech enhancement

Wenhua Shi, Xiongwei Zhang, Xia Zou, Wei Han, et al.

2017 IEEE/ACIS 16th International Conference on Computer and Information Science (ICIS) > 179 - 184

Mask estimation has shown a lot of promise in speech enhancement for its simplicity and large speech intelligibility improvement. In this paper, the gammachirp filter banks are applied on the contaminated speech signal to get the auditory time-frequency representation. Robust principal component analysis with non-negative constraint is employed to decompose the auditory time-frequency representation...

chapter

Disordered Speech Quality estimation using the Matching Pursuit algorithm

Yousef S Ettomi Ali, Vijay Parsa, Philip Doyle, Soulaimane Berkane

2017 IEEE 30th Canadian Conference on Electrical and Computer Engineering (CCECE) > 1 - 5

This paper proposes a novel non-intrusive auditory perception-based approach for disordered speech quality estimation. An adaptive time-frequency algorithm, viz. the Matching Pursuit (MP) algorithm, is used to generate a reference signal from the disordered speech signal. Both the generated reference signal and the original degraded signal are given to the International Telecommunication Union (ITU)-standardized...
