Deep neural networks have been widely applied in the field of environmental sound classification. However, due to the scarcity of carefully labeled data, their training process suffers from over-fitting. Data augmentation is a technique that alleviates this issue. It augments the training set with synthetic data that are created by modifying some parameters of the real data. However, not all kinds...
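As a concrete illustration of the augmentation idea, the sketch below creates a synthetic training clip from a real one via a random circular time shift plus additive noise at a target SNR. The function name, parameters, and values are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def augment_waveform(x, rng, shift_max=1000, noise_snr_db=20.0):
    """Synthesize a training example from a real clip: random circular
    time shift plus Gaussian noise at a target SNR (both parameters are
    illustrative)."""
    shift = int(rng.integers(-shift_max, shift_max + 1))
    y = np.roll(x, shift)
    signal_power = np.mean(y ** 2)
    noise_power = signal_power / (10.0 ** (noise_snr_db / 10.0))
    return y + rng.normal(0.0, np.sqrt(noise_power), size=y.shape)

rng = np.random.default_rng(0)
clip = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s, 440 Hz tone
augmented = augment_waveform(clip, rng)
```

Each call yields a different synthetic variant of the same clip, which is the mechanism by which augmentation enlarges a small labeled set.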
A previously proposed iterative filter design procedure, referred to as Iterative DFT-domain Inversion (IDI), is applied in the context of multizone sound reproduction in reverberant environments. The IDI approach aims at iteratively solving a least-squares problem, where the true reproduction error in the time domain is considered rather than narrowband errors in the frequency domain. In this paper,...
The application of time-frequency-domain techniques in spatial audio is relatively new, as first attempts were published about 15 years ago. A common property of the techniques is that the sound field is captured with multiple microphones, and its properties are analyzed for each time instance and individually for different frequency bands. These properties can be described by a set of parameters...
Many machine learning tasks have been shown solvable with impressive levels of success given large amounts of training data and computational power. For the problems which lack data sufficient to achieve high performance, methods for transfer learning can be applied. These refer to performing the new task while having prior knowledge of the nature of the data, gained by first performing a different...
As part of the 2016 public evaluation challenge on Detection and Classification of Acoustic Scenes and Events (DCASE 2016), the second task focused on evaluating sound event detection systems using synthetic mixtures of office sounds. This task, which follows the ‘Event Detection-Office Synthetic’ task of DCASE 2013, studies the behaviour of tested algorithms when facing controlled levels of audio...
This paper targets a generalized vocal mode classifier (speech/singing) that works on audio data from an arbitrary data source. Previous studies on sound classification are commonly based on cross-validation on a single dataset, without considering training-recognition mismatch. In our study, two experimental setups are used: a matched training-recognition condition and a mismatched training-recognition...
This paper deals with the problem of audio source separation. To handle the complex and ill-posed nature of the problems of audio source separation, the current state-of-the-art approaches employ deep neural networks to obtain instrumental spectra from a mixture. In this study, we propose a novel network architecture that extends the recently developed densely connected convolutional network (DenseNet),...
Multichannel audio enhancement and source separation traditionally attempt to isolate a single source and remove all background noise. In listening enhancement applications, however, a portion of the background sources should be retained to preserve the listener's spatial awareness. We describe a time-varying spatial filter designed to apply a different gain to each sound source with minimal distortion...
In this paper, we propose a new time-frequency mask method for computational auditory scene analysis (CASA) based on convex optimization of the binary mask. In the proposed method, the pitch estimation and segment segregation in conventional CASA are completely replaced by the convex optimization of speech power. Considering the cross-correlation between the power spectra of noisy speech and noise...
Audio source separation is the act of isolating sound sources in an audio scene. One application of source separation is singing voice extraction. In this work, we present a novel approach for music/voice separation that uses the 2D Fourier Transform (2DFT). Our approach leverages how periodic patterns manifest in the 2D Fourier Transform and is connected to research in biological auditory systems...
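The core intuition — repeating (periodic) structure in a spectrogram concentrates into peaks in its 2D Fourier transform — can be sketched as follows: keeping only the strongest 2DFT coefficients gives a repeating "background" estimate, and the residual serves as the foreground. The `keep_frac` threshold is a hypothetical knob for illustration, not the paper's actual peak-picking rule:

```python
import numpy as np
from scipy.signal import stft

def separate_2dft(spec_mag, keep_frac=0.01):
    """Split a magnitude spectrogram into a repeating (background) part
    and a residual (foreground) part by keeping only the strongest
    coefficients of its 2D Fourier transform."""
    F = np.fft.fft2(spec_mag)
    mags = np.abs(F)
    thresh = np.quantile(mags, 1.0 - keep_frac)
    bg = np.real(np.fft.ifft2(np.where(mags >= thresh, F, 0)))
    bg = np.clip(bg, 0.0, None)              # magnitudes cannot be negative
    fg = np.clip(spec_mag - bg, 0.0, None)   # residual = foreground estimate
    return bg, fg

# toy mixture of two steady tones, analyzed with an STFT
t = np.arange(8000) / 8000.0
mix = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 1234 * t)
_, _, Z = stft(mix, fs=8000, nperseg=256)
bg, fg = separate_2dft(np.abs(Z))
```

In a real music/voice scenario the accompaniment's repetition lands in the peaks (background) while the less periodic vocal line remains in the residual.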
Natural conversations are spontaneous exchanges involving two or more people speaking in an intermittent manner. Therefore, one expects such conversations to have intervals where some of the speakers are silent. Yet, most (multichannel) audio source separation (MASS) methods consider the sound sources to be emitting continuously over the total duration of the processed mixture. In this paper we propose...
In this article, we present a target-speaker-dependent speech enhancement system to enhance a specific target talker in the presence of real-life background noises. The proposed system uses a multi-channel processing stage to produce a noise reference signal. This noise reference signal is then used not only to compute the residual noise statistics from the multichannel stage output, but also to...
This paper introduces a new method for single-channel denoising that sheds new light on classical early work on this topic from the 1970s and 1980s on Wiener filtering and spectral subtraction. Both operating in the short-time Fourier transform domain, these methods consist of estimating the power spectral density (PSD) of the noise during speech-free periods. Then, the clean speech signal...
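A minimal spectral-subtraction sketch in that spirit: the noise PSD is estimated from a speech-free segment, subtracted from the noisy power spectrum, and the result is resynthesized with the noisy phase. The spectral floor is an illustrative safeguard against negative power, not part of the paper's method:

```python
import numpy as np
from scipy.signal import stft, istft

def spectral_subtraction(noisy, noise_only, fs=16000, nperseg=512, floor=0.01):
    """Classic spectral subtraction: noise PSD from a speech-free segment,
    subtracted bin-wise; noisy phase reused for resynthesis."""
    _, _, N = stft(noise_only, fs=fs, nperseg=nperseg)
    noise_psd = np.mean(np.abs(N) ** 2, axis=1, keepdims=True)
    _, _, Y = stft(noisy, fs=fs, nperseg=nperseg)
    clean_power = np.maximum(np.abs(Y) ** 2 - noise_psd, floor * np.abs(Y) ** 2)
    S_hat = np.sqrt(clean_power) * np.exp(1j * np.angle(Y))
    _, x_hat = istft(S_hat, fs=fs, nperseg=nperseg)
    return x_hat

rng = np.random.default_rng(1)
fs = 16000
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 440 * t)          # 1 s clean "speech" stand-in
noise = 0.3 * rng.normal(size=2 * fs)        # first half: speech-free reference
noisy = clean + noise[fs:]
enhanced = spectral_subtraction(noisy, noise[:fs], fs=fs)
```

Subtracting the average noise power brings the output energy back toward that of the clean signal, at the cost of the musical-noise artifacts these classical methods are known for.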
This paper addresses the problem of audio source separation from (possibly under-determined) multichannel convolutive mixtures. We propose a separation method based on the convolutive transfer function (CTF) in the short-time Fourier transform domain. For strongly reverberant signals, the CTF is a much more appropriate model than the widely-used multiplicative transfer function approximation. An Expectation-Maximization...
In this work we propose novel joint and sequential multimodal approaches for the task of single channel audio source separation in videos. This is done within the popular non-negative matrix factorization framework using information about the sounding object's motion. Specifically, we present methods that utilize non-negative least squares formulation to couple motion and audio information. The proposed...
The ability to separate speech from non-stationary background disturbances using only a single channel of information has increased significantly with the adoption of deep learning techniques. In these approaches, a time-frequency mask that recovers clean speech from noisy mixtures is learned from data. Recurrent neural networks are particularly well-suited to this sequential prediction task, with...
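The training target such mask-estimation networks commonly regress, the ideal ratio mask (IRM), can be computed and applied as below. This is the oracle computation for illustration; in the approaches described, an RNN predicts this mask from noisy features:

```python
import numpy as np

def ideal_ratio_mask(clean_mag, noise_mag, eps=1e-8):
    """Oracle time-frequency mask: the fraction of each bin's magnitude
    attributable to speech. A network is trained to predict this from
    the noisy spectrogram."""
    return clean_mag / (clean_mag + noise_mag + eps)

clean = np.array([[1.0, 0.0], [0.0, 2.0]])   # toy magnitude spectrograms
noise = np.array([[0.0, 3.0], [1.0, 0.0]])
mask = ideal_ratio_mask(clean, noise)
recovered = mask * (clean + noise)           # mask applied to the mixture
```

Because the mask is bounded in [0, 1], it is a well-behaved regression target; applying it to the mixture magnitude recovers the clean bins where speech and noise are disjoint.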
Deep neural networks (DNN) have been successfully employed for the problem of monaural sound source separation achieving state-of-the-art results. In this paper, we propose using convolutional recurrent neural network (CRNN) architecture for tackling this problem. We focus on a scenario where low algorithmic delay (< 10 ms) is paramount, and relatively little training data is available. We show...
We propose an efficient method to estimate source power spectral densities (PSDs) in a multi-source reverberant environment using a spherical microphone array. The proposed method utilizes the spatial correlation between the spherical harmonics (SH) coefficients of a sound field to estimate source PSDs. The use of the spatial cross-correlation of the SH coefficients allows us to employ the method...
This paper addresses the problem of joint wideband localization and acquisition of acoustic sources. The source locations as well as acquisition of the original source signals are obtained in a joint fashion by solving a sparse recovery problem. Spatial sparsity is enforced by discretizing the acoustic scene into a grid of predefined dimensions. In practice, energy leakage from the source location...
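The grid-based sparse-recovery step can be illustrated with a greedy solver such as orthogonal matching pursuit (OMP) over a narrowband steering dictionary. The array geometry, grid, and solver here are illustrative stand-ins for the paper's actual wideband joint formulation:

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal matching pursuit: greedily pick k dictionary columns
    (grid points) and re-fit their amplitudes by least squares."""
    residual = y.copy()
    support = []
    for _ in range(k):
        idx = int(np.argmax(np.abs(A.conj().T @ residual)))
        support.append(idx)
        amps, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ amps
    x = np.zeros(A.shape[1], dtype=complex)
    x[support] = amps
    return x, sorted(support)

# toy narrowband setup: 8-mic half-wavelength ULA, 32 candidate directions
M, G = 8, 32
u = np.linspace(-1.0, 1.0, G, endpoint=False)        # sin(angle) grid
A = np.exp(-1j * np.pi * np.outer(np.arange(M), u))  # steering dictionary
y = A[:, [4, 20]] @ np.array([1.0, 0.8])             # two active grid points
x_hat, support = omp(A, y, k=2)
```

The recovered support gives the source locations on the grid and the fitted amplitudes give the acquired source signals, which is the "joint" aspect of the sparse formulation; energy leakage arises when a true source falls between grid points, unlike in this idealized on-grid example.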