This paper compares different approaches for using deep neural networks (DNNs) trained to predict senone posteriors for the task of spoken language recognition (SLR). These approaches have recently been found to outperform various baseline systems on different datasets, but they have not yet been compared to each other or to a common baseline. Two of these approaches use the DNNs to generate feature...
The recent application of deep neural networks (DNNs) to speaker identification (SID) has resulted in significant improvements over the current state of the art on telephone speech. In this work, we report a similar achievement in DNN-based SID performance on microphone speech. We consider two approaches to DNN-based SID: one that uses the DNN to extract features, and another that uses the DNN during feature...
Prosody is the aspect of speech that reflects rhythm, stress, and intonation. In language identification tasks, these characteristics are assumed to be language dependent, and thus the language can be identified from them. In this paper, an automatic language recognition system that extracts prosody information from speech and makes decisions about the language with a generative classifier based...
We describe recent progress in the field of prosodic modeling for speaker verification. In a previous paper, we proposed a technique for modeling syllable-based prosodic features that uses a multinomial subspace model for feature extraction and within-class covariance normalization or linear discriminant analysis for session variability compensation. In this paper, we show that performance can be...
The goal of this work was to explore the optimization of the feature extraction module (front-end) parameters to improve bird species recognition. We explored optimizing the spectral and temporal parameters of a Mel cepstrum feature-based front-end, starting from common parameter values used in speech processing experiments. These features were modeled using a Gaussian mixture model (GMM) system....
Prosodic information has been successfully used for speaker recognition for more than a decade. The best-performing prosodic system to date has been one based on features extracted over syllables obtained automatically from speech recognition output. The features are then transformed using a Fisher kernel, and speaker models are trained using support vector machines (SVMs). Recently, a simpler version...