Search results for: Lukas Drude

Items from 1 to 12 out of 12 results

article

A generic neural acoustic beamforming architecture for robust multi-channel speech processing

Jahn Heymann, Lukas Drude, Reinhold Haeb-Umbach

Computer Speech & Language > 2017 > 46 > C > 374-385

Acoustic beamforming can greatly improve the performance of Automatic Speech Recognition (ASR) and speech enhancement systems when multiple channels are available. We recently proposed a way to support the model-based Generalized Eigenvalue beamforming operation with a powerful neural network for spectral mask estimation. The enhancement system has a number of desirable properties. In particular, neither...

chapter

Multi-stage coherence drift based sampling rate synchronization for acoustic beamforming

Joerg Schmalenstroeer, Jahn Heymann, Lukas Drude, Christoph Boeddeker, et al.

2017 IEEE 19th International Workshop on Multimedia Signal Processing (MMSP) > 1 - 6

Multi-channel speech enhancement algorithms rely on a synchronous sampling of the microphone signals. This, however, cannot always be guaranteed, especially if the sensors are distributed in an environment. To avoid performance degradation the sampling rate offset needs to be estimated and compensated for. In this contribution we extend the recently proposed coherence drift based method in two important...

chapter

Optimizing neural-network supported acoustic beamforming by algorithmic differentiation

Christoph Boeddeker, Patrick Hanebrink, Lukas Drude, Jahn Heymann, et al.

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 171 - 175

In this paper we show how a neural network for spectral mask estimation for an acoustic beamformer can be optimized by algorithmic differentiation. Using the beamformer output SNR as the objective function to maximize, the gradient is propagated through the beamformer all the way to the neural network which provides the clean speech and noise masks from which the beamformer coefficients are estimated...

chapter

Beamnet: End-to-end training of a beamformer-supported multi-channel ASR system

Jahn Heymann, Lukas Drude, Christoph Boeddeker, Patrick Hanebrink, et al.

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5325 - 5329

This paper presents an end-to-end training approach for a beamformer-supported multi-channel ASR system. A neural network which estimates masks for a statistically optimum beamformer is jointly trained with a network for acoustic modeling. To update its parameters, we propagate the gradients from the acoustic model all the way through feature extraction and the complex valued beamforming operation...

chapter

Neural network based spectral mask estimation for acoustic beamforming

Jahn Heymann, Lukas Drude, Reinhold Haeb-Umbach

2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 196 - 200

We present a neural network based approach to acoustic beamforming. The network is used to estimate spectral masks from which the Cross-Power Spectral Density matrices of speech and noise are estimated, which in turn are used to compute the beamformer coefficients. The network training is independent of the number and the geometric configuration of the microphones. We further show that it is possible...

chapter

Blind speech separation based on complex spherical k-mode clustering

Lukas Drude, Christoph Boeddeker, Reinhold Haeb-Umbach

2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 141 - 145

We present an algorithm for clustering complex-valued unit length vectors on the unit hypersphere, which we call complex spherical k-mode clustering, as it can be viewed as a generalization of the spherical k-means algorithm to normalized complex-valued vectors. We show how the proposed algorithm can be derived from the Expectation Maximization algorithm for complex Watson mixture models and prove...

chapter

BLSTM supported GEV beamformer front-end for the 3RD CHiME challenge

Jahn Heymann, Lukas Drude, Aleksej Chinaev, Reinhold Haeb-Umbach

2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) > 444 - 451

We present a new beamformer front-end for Automatic Speech Recognition and apply it to the 3rd-CHiME Speech Separation and Recognition Challenge. Without any further modification of the back-end, we achieve a 53% relative reduction of the word error rate over the best baseline enhancement system for the relevant test data set. Our approach leverages the power of a bi-directional Long Short-Term Memory...

chapter

DOA-estimation based on a complex Watson kernel method

Lukas Drude, Florian Jacob, Reinhold Haeb-Umbach

2015 23rd European Signal Processing Conference (EUSIPCO) > 255 - 259

This contribution presents a Direction of Arrival (DoA) estimation algorithm based on the complex Watson distribution to incorporate both phase and level differences of captured microphone array signals. The derived algorithm is reviewed in the context of the Generalized State Coherence Transform (GSCT) on the one hand and a kernel density estimation method on the other hand. A thorough simulative...

chapter

Source counting in speech mixtures by nonparametric Bayesian estimation of an infinite Gaussian mixture model

Oliver Walter, Lukas Drude, Reinhold Haeb-Umbach

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 459 - 463

In this paper we present a source counting algorithm to determine the number of speakers in a speech mixture. In our proposed method, we model the histogram of estimated directions of arrival with a non-parametric Bayesian infinite Gaussian mixture model. As an alternative to classical model selection criteria and to avoid specifying the maximum number of mixture components in advance, a Dirichlet...

chapter

Towards online source counting in speech mixtures applying a variational EM for complex Watson mixture models

Lukas Drude, Aleksej Chinaev, Dang Hai Tran Vu, Reinhold Haeb-Umbach

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC) > 213 - 217

This contribution describes a step-wise source counting algorithm to determine the number of speakers in an offline scenario. Each speaker is identified by a variational expectation maximization (VEM) algorithm for complex Watson mixture models and therefore directly yields beamforming vectors for a subsequent speech separation process. An observation selection criterion is proposed which improves...

article

Photovoltaics (PV) and electric vehicle-to-grid (V2G) strategies for peak demand reduction in urban regions in Brazil in a smart grid environment

Lukas Drude, Luiz Carlos Pereira Junior, Ricardo Rüther

Renewable Energy > 2014 > 68 > Complete > 443-451

Vehicle-to-grid (V2G) energy transfer in a smart grid environment opens a new revenue opportunity for electric-drive vehicles (EVs), and might reduce grid operation costs in demand-constrained urban feeders where peak-electricity prices are high. This paper analyses the peak demand energy market for V2G in the urban region of Florianópolis, Brazil. The article describes known V2G-concepts and introduces...

chapter

Source counting in speech mixtures using a variational EM approach for complex WATSON mixture models

Lukas Drude, Aleksej Chinaev, Dang Hai Tran Vu, Reinhold Haeb-Umbach

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 6834 - 6838

In this contribution we derive a variational EM (VEM) algorithm for model selection in complex Watson mixture models, which have been recently proposed as a model of the distribution of normalized microphone array signals in the short-time Fourier transform domain. The VEM algorithm is applied to count the number of active sources in a speech mixture by iteratively estimating the mode vectors of the...

INFONA - science communication portal
