Wyniki wyszukiwania dla: Björn Schuller

Pozycje od 101 do 112 spośród 112 wyników

Poprzednia

Następna

rozdział

Audio recognition in the wild: Static and dynamic classification on a real-world database of animal vocalizations

Felix Weninger, Bjorn Schuller

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 337 - 340

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

We present a study on purely data-based recognition of animal sounds, performing evaluation on a real-world database obtained from the Humboldt-University Animal Sound Archive. As we avoid a preselection of friendly cases, the challenge for the classifiers is to discriminate between species regardless of the age or stance of the animal. We define classification tasks that can be useful for information...

rozdział

Localization of non-linguistic events in spontaneous speech by Non-Negative Matrix Factorization and Long Short-Term Memory

Felix Weninger, Bjorn Schuller, Martin Wollmer, Gerhard Rigoll

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5840 - 5843

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Features generated by Non-Negative Matrix Factorization (NMF) have successfully been introduced into robust speech processing, including noise-robust speech recognition and detection of non-linguistic vocalizations. In this study, we introduce a novel tandem approach by integrating likelihood features derived from NMF into Bidirectional Long Short-Term Memory Recurrent Neural Networks (BLSTM-RNNs)...

rozdział

Combining monaural source separation with Long Short-Term Memory for increased robustness in vocalist gender recognition

Felix Weninger, Jean-Louis Durrieu, Florian Eyben, Gael Richard, więcej

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 2196 - 2199

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

We present a novel and unique combination of algorithms to detect the gender of the leading vocalist in recorded popular music. Building on our previous successful approach that enhanced the harmonic parts by means of Non-Negative Matrix Factorization (NMF) for increased accuracy, we integrate on the one hand a new source separation algorithm specifically tailored to extracting the leading voice from...

rozdział

A multi-stream ASR framework for BLSTM modeling of conversational speech

Martin Wollmer, Florian Eyben, Bjorn Schuller, Gerhard Rigoll

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4860 - 4863

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

We propose a novel multi-stream framework for continuous conversational speech recognition which employs bidirectional Long Short-Term Memory (BLSTM) networks for phoneme prediction. The BLSTM architecture allows recurrent neural nets to model long-range context, which led to improved ASR performance when combined with conventional triphone modeling in a Tandem system. In this paper, we extend the...

rozdział

Come and have an emotional workout with sensitive artificial listeners!

M Schroder, S Pammi, H Gunes, M Pantic, więcej

Face and Gesture 2011 > 646

2011 IEEE International Conference on Automatic Face & Gesture Recognition (FG 2011)

This demonstration aims to showcase the recently completed SEMAINE system. The SEMAINE system is a publicly available, fully autonomous Sensitive Artificial Listeners (SAL) system that consists of virtual dialog partners based on audiovisual analysis and synthesis (see http://semaine.opendfki.de/wiki). The system runs in real-time, and combines incremental analysis of user behavior, dialog management,...

artykuł

Recognizing Affect from Linguistic Information in 3D Continuous Space

Bjorn Schuller

IEEE Transactions on Affective Computing > 2011 > 2 > 4 > 192 - 205

Most research efforts dealing with recognition of emotion-related states from the human speech signal concentrate on acoustic analysis. However, the last decade's research results show that the task cannot be solved to complete satisfaction, especially when it comes to real life speech data and in particular to the assessment of speakers' valence. This paper therefore investigates novel approaches...

rozdział

Late fusion of individual engines for improved recognition of negative emotion in speech - learning vs. democratic vote

Bjorn Schuller, Florian Metze, Stefan Steidl, Anton Batliner, więcej

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 5230 - 5233

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

The fusion of multiple recognition engines is known to be able to outperform individual ones, given sufficient independence of methods, models, and knowledge sources. We therefore investigate late fusion of different speech-based recognizers of emotion. Two generally different streams of information are considered: acoustics and linguistics fed by state-of-the-art automatic speech recognition. A total...

rozdział

Spoken term detection with Connectionist Temporal Classification: A novel hybrid CTC-DBN decoder

Martin Wöllmer, Florian Eyben, Bjorn Schuller, Gerhard Rigoll

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 5274 - 5277

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

This paper proposes a novel system for robust keyword detection in continuous speech. Our decoder is composed of a bidirectional Long Short-Term Memory recurrent neural network using a Connectionist Temporal Classification (CTC) output layer, and a Dynamic Bayesian Network (DBN). The CTC network exploits bidirectional context information to reliably identify phonemes, whereas the DBN is able to discriminate...

rozdział

Learning with synthesized speech for automatic emotion recognition

Bjorn Schuller, Felix Burkhardt

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 5150 - 5153

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

Data sparseness is an ever dominating problem in automatic emotion recognition. Using artificially generated speech for training or adapting models could potentially ease this: though less natural than human speech, one could synthesize the exact spoken content in different emotional nuances - of many speakers and even in different languages. To investigate chances, the phonemisation components Txt2Pho...

rozdział

Suspicious Behavior Detection in Public Transport by Fusion of Low-Level Video Descriptors

Dejan Arsic, Bjorn Schuller, Gerhard Rigoll

Multimedia and Expo, 2007 IEEE International Conference on > 2018 - 2021

2007 IEEE International Conference on Multimedia and Expo

Recently great interest has been shown in the visual surveillance of public transportation systems. The challenge is the automated analysis of passenger's behaviors with a set of visual low-level features, which can be extracted robustly. On a set of global motion features computed in different parts of the image, here the complete image, the face and skin color regions, a classification with Support...

rozdział

Towards More Reality in the Recognition of Emotional Speech

Bjorn Schuller, Dino Seppi, Anton Batliner, Andreas Maier, więcej

2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '7 > 4 > IV-941 - IV-944

2007 IEEE International Conference on Acoustics, Speech, and Signal Processing

As automatic emotion recognition based on speech matures, new challenges can be faced. We therefore address the major aspects in view of potential applications in the field, to benchmark today's emotion recognition systems and bridge the gap between commercial interest and current performances: acted vs. spontaneous speech, realistic emotions, noise and microphone conditions, and speaker independence...

rozdział

Submotions for Hidden Markov Model Based Dynamic Facial Action Recognition

Dejan Arsic, Joachim Schenk, Bjorn Schuller, Frank Wallhoff, więcej

2006 International Conference on Image Processing > 673 - 676

2006 International Conference on Image Processing

Video based analysis of a persons' mood or behavior is in general performed by interpreting various features observed on the body. Facial actions, such as speaking, yawning or laughing are considered as key features. Dynamic changes within the face can be modeled with the well known hidden Markov models (HMM). Unfortunately even within one class examples can show a high variance because of unknown...

Poprzednia

Następna

Opcje filtrowania

Zbiór danych:
ieee

Data publikacji

Ustaw własny zakres dat

INFONA - portal komunikacji naukowej

Wyniki wyszukiwania dla: Björn Schuller

Dodaj adresata

Anulowanie wysłania wiadomości

Czy na pewno chcesz anulować wysłanie wiadomości?

Wyślij wiadomość

Opcje filtrowania

Data publikacji

Ustawianie zakresu dat

Podaj zakres dat dla filtrowania wyświetlonych wyników. Możesz podać datę początkową, końcową lub obie daty. Daty możesz wpisać ręcznie lub wybrać za pomocą kalendarza.

Typ publikacji

Słowa kluczowe

Czasopismo

Zgłaszanie błędu / nadużycia

Nieudane wysłanie zgłoszenia

Ułatwienia dostępu