In this article we describe the first Text-to-Speech (TTS) system for the Greek language based on the Festival architecture. We discuss practical implementation details and focus on the preparation of the diphone database and on the phoneme duration prediction module, implemented with the CART technique. Two male databases were used for two different speech synthesis engines, namely, residual...
The present work contributes to the field of generalized sound classification. We extensively examine the performance of the following three feature sets: a) MPEG-7 Audio Spectrum Projection, b) MFCC (using an alternative method for their extraction) and c) a group derived using critical-band-based wavelet packets. Subsequently, three types of temporal feature integration strategies are applied on the...
The present paper describes the construction of a multimodal database, referred to as the PROMETHEUS database, which contains recordings from heterogeneous sensors. The main purpose of this database is the development of a framework for monitoring and interpretation of human behavior in unrestricted environments of both indoor and outdoor type. It contains single-person and multi-person scenarios,...
This study presents a practical methodology for automatic space monitoring based solely on the perceived acoustic information. We consider the case where atypical situations such as screams, explosions and gunshots take place in a metro station environment. Our approach is based on a two-stage recognition scheme, each stage exploiting HMMs for approximating the density function of the corresponding...
This work reports recent progress towards the development of a pilot system for automatic identification of singing insects. We propose a sound parameterization technique that is designed explicitly for the needs of acoustic insect recognition. It is combined with state-of-the-art classification methods that dominate speaker recognition technology. Specifically, the categorization of acoustic emissions...
The general problem addressed in this work is automatic identification of insects using only the acoustic modality. In particular, we discuss the characteristics of the acoustic profiles of two target groups of insects: crickets and cicadas. Subsequently, we employ advanced machine learning techniques to categorize them on the levels of specific insect, family, subfamily, genus, and species. To deal...
Our work introduces a speech enhancement technique that can explicitly incorporate prior information about the gender or speaker time-frequency characteristics in its formalism. We approximate the multimodal, clean speech linear spectrum magnitude with a mixture of Gaussian pdfs using the Expectation-Maximization (EM) algorithm. Subsequently, we apply the Bayesian inference framework to the degraded...