Search results for: Padmanabhan Rajan

Items from 1 to 9 out of 9 results

chapter

Rényi entropy based mutual information for semi-supervised bird vocalization segmentation

Anshul Thakur, Vinayak Abrol, Pulkit Sharma, Padmanabhan Rajan

2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP) > 1 - 6

In this paper we describe a semi-supervised algorithm to segment bird vocalizations using matrix factorization and Rényi entropy based mutual information. Singular value decomposition (SVD) is applied on pooled time-frequency representations of bird vocalizations to learn basis vectors. By utilizing only a few of the bases, a compact feature representation is obtained for input test data. Rényi entropy...

chapter

Rapid bird activity detection using probabilistic sequence kernels

Anshul Thakur, R. Jyothi, Padmanabhan Rajan, A.D. Dileep

2017 25th European Signal Processing Conference (EUSIPCO) > 1754 - 1758

Bird activity detection is the task of determining if a bird sound is present in a given audio recording. This paper describes a bird activity detector which utilises a support vector machine (SVM) with a dynamic kernel. Dynamic kernels are used to process sets of feature vectors having different cardinalities. Probabilistic sequence kernel (PSK) is one such dynamic kernel. The PSK converts a set...

chapter

Unsupervised birdcall activity detection using source and system features

Anshul Thakur, Padmanabhan Rajan

2017 Twenty-third National Conference on Communications (NCC) > 1 - 6

In this paper, we describe an unsupervised method to segment birdcalls from the background in bioacoustic recordings. The method utilizes information derived from both source features as well as system features. Three types of source features are extracted from the linear prediction residual signal, and Mel frequency cepstral coefficients are extracted from the system features. The source features...

chapter

Model-based unsupervised segmentation of birdcalls from field recordings

Anshul Thakur, Padmanabhan Rajan

2016 10th International Conference on Signal Processing and Communication Systems (ICSPCS) > 1 - 6

In this paper, we describe an unsupervised, species independent method to segment birdcalls from the background in bio-acoustic recordings. The method follows a two-pass approach. An initial segmentation is performed utilizing K-means clustering. This provides labels to train Gaussian mixture acoustic models, which are built using Mel frequency cepstral coefficients. Using the acoustic models, the...

chapter

Bird Call Identification Using Dynamic Kernel Based Support Vector Machines and Deep Neural Networks

Deep Chakraborty, Paawan Mukker, Padmanabhan Rajan, A. D. Dileep

2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA) > 280 - 285

In this paper, we apply speech and audio processing techniques to bird vocalizations for the classification of birds found in the lower Himalayan regions. Mel frequency cepstral coefficients (MFCC) are extracted from each recording. As a result, the recordings are now represented as varying length sets of feature vectors. Dynamic kernel based support vector machines (SVMs) and deep neural networks...

chapter

Group delay functions for speaker diarization

Mohit Yadav, Anil Kumar Sao, A D Dileep, Padmanabhan Rajan

2016 Twenty Second National Conference on Communication (NCC) > 1 - 5

Speaker diarization is the task of determining “who spoke when” in a speech recording of an unknown duration containing an unknown number of speakers. The very unsupervised nature of this task makes it more challenging and demands that the feature representation used be highly discriminative across speakers. Commonly used features based on the short time Fourier transform are usually derived from...

article

From single to multiple enrollment i-vectors: Practical PLDA scoring variants for speaker verification

Padmanabhan Rajan, Anton Afanasyev, Ville Hautamäki, Tomi Kinnunen

Digital Signal Processing > 2014 > 31 > Complete > 93-101

The availability of multiple utterances (and hence, i-vectors) for speaker enrollment brings up several alternatives for their utilization with probabilistic linear discriminant analysis (PLDA). This paper provides an overview of their effective utilization, from a practical viewpoint. We derive expressions for the evaluation of the likelihood ratio for the multi-enrollment case, with details on the...

chapter

A practical, self-adaptive voice activity detector for speaker verification with noisy telephone and microphone data

Tomi Kinnunen, Padmanabhan Rajan

2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 7229 - 7233

A voice activity detector (VAD) plays a vital role in robust speaker verification, where energy VAD is most commonly used. Energy VAD works well in noise-free conditions but deteriorates in noisy conditions. One way to tackle this is to introduce speech enhancement preprocessing. We study an alternative, likelihood ratio based VAD that trains speech and nonspeech models on an utterance-by-utterance...

chapter

Multi-layer perceptron based speech activity detection for speaker verification

Sriram Ganapathy, Padmanabhan Rajan, Hynek Hermansky

2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) > 321 - 324

In this paper, we present a speech activity detection (SAD) technique for speaker verification in noisy environments. The proposed SAD is based on phoneme posteriors derived from a multi-layer perceptron (MLP). The MLP is trained using modulation spectral features, where long temporal segments of the speech signal are analyzed in critical bands. In each sub-band, temporal envelopes are derived using...
