This paper describes the implementation of a speech synthesizer that uses a diphone database in parametric format. The paper also deals with the harmonic plus noise model (HNM), a sinusoidal model used for coding the database segments. The HNM representation of speech also allows us to significantly reduce the size of the speech-segment database for concatenative speech synthesis.
This paper focuses on using bigrams for topic determination in a speech synthesizer. It explains a modular architecture for the speech synthesizer and the importance of context analysis for customizing synthesized speech and enhancing its quality. Bigrams carry information about context, and this work shows how to use them to improve identification of the theme. At...
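To illustrate the idea of bigram-based topic determination, here is a minimal sketch, not the authors' method. It assumes each topic is described by a hand-made set of characteristic word pairs; the `TOPICS` profiles, the function names, and the simple match-count scoring are all hypothetical.

```python
from collections import Counter


def bigrams(tokens):
    """Return the list of adjacent word pairs in a token sequence."""
    return list(zip(tokens, tokens[1:]))


def topic_scores(text, topic_bigrams):
    """Count, per topic, how many bigrams of `text` occur in that topic's set."""
    counts = Counter(bigrams(text.lower().split()))
    return {
        topic: sum(counts[pair] for pair in pairs)
        for topic, pairs in topic_bigrams.items()
    }


# Hypothetical topic profiles (assumption: not taken from the paper).
TOPICS = {
    "weather": {("heavy", "rain"), ("strong", "wind")},
    "sport": {("home", "team"), ("final", "score")},
}

scores = topic_scores("Heavy rain and strong wind are expected", TOPICS)
best = max(scores, key=scores.get)
print(scores, best)
```

A real system would more plausibly learn weighted bigram statistics from a labelled corpus rather than use fixed sets; the sketch only shows why a word pair carries more context than a single word.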
In this work we created a module for a speech synthesizer. The main function of this module is to change the fundamental frequency. There are many ways to change the shape of the melodic curve; we chose to modify the fundamental frequency with a PRAAT script. We also analyzed several types of Slovak sentences, focusing on their prosodic curves.
This article discusses the impact of substituting some of the basic speech features with voiced/unvoiced information and, possibly, with the estimated pitch value. The average magnitude difference function was adopted as a measure of the signal's voicing, in particular the ratio of its average value to its local minima found within the accepted pitch range. Furthermore, the pitch itself...
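The voicing measure described above can be sketched as follows. This is a simplified illustration under stated assumptions: it takes the single smallest AMDF value in the lag range rather than a set of local minima, and the lag range (60 to 400 Hz), the sampling rate, and the function names are hypothetical choices, not taken from the article.

```python
import math


def amdf(frame, lag):
    """Average magnitude difference between `frame` and itself shifted by `lag`."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n


def voicing_ratio(frame, min_lag, max_lag):
    """Return (ratio, best_lag): mean AMDF over the lag range divided by its minimum.

    A large ratio means a deep AMDF valley, which suggests a voiced frame.
    """
    values = {lag: amdf(frame, lag) for lag in range(min_lag, max_lag + 1)}
    best_lag = min(values, key=values.get)
    mean_value = sum(values.values()) / len(values)
    eps = 1e-12  # guards against division by zero for perfectly periodic input
    return mean_value / (values[best_lag] + eps), best_lag


fs = 8000  # assumed sampling rate in Hz
frame = [math.sin(2 * math.pi * 100 * i / fs) for i in range(400)]  # 100 Hz tone
ratio, lag = voicing_ratio(frame, fs // 400, fs // 60)
print(ratio, fs / lag)  # fs / lag is the pitch estimate in Hz
```

For noise-like (unvoiced) frames the AMDF has no pronounced valley, so the ratio stays close to 1.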
One of the main topics in the area of user-friendly human-machine interfaces is speech synthesis. The paper deals with the application of the harmonic plus noise model (HNM) to the preparation of a compressed speech database in a format that allows easy prosodic modification of the synthesized speech. The HNM approach of speech representation allows us to reduce significantly the database of...
This article describes an extension of the HNM model used for prosodic modification of speech. The HNM model represents speech signals as the sum of a harmonic part and a noise part. Decomposing the speech signal into these two parts allows more natural-sounding modifications of the signal. The parametric representation of speech provides a straightforward way of changing the prosodic features of the speech. Our algorithm,...
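The decomposition into a harmonic and a noise part can be illustrated with a toy frame synthesizer. This is a rough sketch, not the algorithm from the article: the noise part here is plain white Gaussian noise, whereas a full HNM shapes the noise spectrally and in time, and all names and parameter values are hypothetical.

```python
import math
import random


def hnm_frame(f0, amplitudes, noise_gain, n_samples, fs, seed=0):
    """Synthesize one frame as harmonic part plus noise part.

    Harmonic part: cosines at integer multiples of f0 with the given amplitudes.
    Noise part: white Gaussian noise scaled by noise_gain (a simplification).
    """
    rng = random.Random(seed)
    samples = []
    for n in range(n_samples):
        t = n / fs
        harmonic = sum(
            a * math.cos(2 * math.pi * (k + 1) * f0 * t)
            for k, a in enumerate(amplitudes)
        )
        noise = noise_gain * rng.gauss(0.0, 1.0)
        samples.append(harmonic + noise)
    return samples


amps = [1.0, 0.5, 0.25]  # assumed harmonic amplitudes
original = hnm_frame(120.0, amps, 0.05, 160, 8000)
# Pitch modification in the parametric domain: only f0 changes.
raised = hnm_frame(120.0 * 1.2, amps, 0.05, 160, 8000)
```

Because the frame is stored as parameters rather than samples, raising the pitch amounts to resynthesizing with a different `f0`. A more careful implementation would also resample the amplitudes from the spectral envelope at the new harmonic frequencies.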
In this paper we discuss an automatic speech recognition system based on the SPHINX system in various versions and configurations. We compare Sphinx versions 3 and 4 for the recognition of Slovak speech. A further comparison focuses on the type of language model: we compared a regular grammar with a bigram language model.
This article describes the design of an improvement to a TTS system for Slovak speech synthesis. The improvement consists of a new type of parametric corpus storage that allows prosodic modification of the speech in real time. This approach is based on the HNM (harmonic plus noise) model, which represents speech signals as the sum of a harmonic part and a noise part. The decomposition of speech...
The paper introduces the idea of a metropolitan information system. The aim of the system is to provide various kinds of information about the city, not only for tourists and visitors but for the citizens of the city as well. The main principle is to access data from the Internet and to provide a user-friendly interface to these data using various types of intelligent kiosks...
In this paper we describe a proposal for a multimedia kiosk that can be used to provide diverse information to the general public. We propose two versions of the intelligent kiosk. The first version is placed in public places and offers a three-dimensional human head, shown on a large display, that gives information about the city, institutions, weather, etc. It is a system with integrated microphone array, camera...
This project is about speech synthesis and the creation of a speech synthesiser for a mobile phone. The first part of the project covers speech synthesis. Of all the types of synthesis, only diphone synthesis is discussed further, because its features make it better suited to a mobile phone than the other types. This work further analyses the implementation of the speech synthesiser, that is, loading...
Pitch period estimation (also called fundamental frequency estimation) is needed for many purposes in speech processing. In our system for prosodic modification of speech, pitch period estimation is used as the basis for frame length detection. The pitch period estimation method used in the system is a hybrid method that is based on the YIN fundamental frequency estimation algorithm and a method...
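As background for the YIN component only, the sketch below implements its core steps: the difference function, the cumulative mean normalized difference, and an absolute threshold followed by a walk to the local minimum. It omits YIN's parabolic interpolation and says nothing about the second method in the hybrid. The search range, the threshold of 0.1, and the function name are assumptions.

```python
import math


def yin_pitch(frame, fs, f_min=60.0, f_max=400.0, threshold=0.1):
    """Estimate f0 in Hz with the core YIN steps, or return None if no dip is found."""
    tau_min = int(fs / f_max)
    tau_max = int(fs / f_min)
    window = len(frame) - tau_max  # number of samples compared at every lag

    # Step 1: difference function d(tau).
    d = [0.0] * (tau_max + 1)
    for tau in range(1, tau_max + 1):
        d[tau] = sum((frame[j] - frame[j + tau]) ** 2 for j in range(window))

    # Step 2: cumulative mean normalized difference d'(tau), with d'(0) = 1.
    cmnd = [1.0] * (tau_max + 1)
    running = 0.0
    for tau in range(1, tau_max + 1):
        running += d[tau]
        cmnd[tau] = d[tau] * tau / running if running > 0 else 1.0

    # Step 3: first lag below the threshold, then descend to the local minimum.
    for tau in range(tau_min, tau_max):
        if cmnd[tau] < threshold:
            while tau + 1 <= tau_max and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            return fs / tau
    return None


fs = 8000  # assumed sampling rate in Hz
tone = [math.sin(2 * math.pi * 100 * i / fs) for i in range(400)]
f0 = yin_pitch(tone, fs)
print(f0)
```

The pitch period in samples is `fs / f0`, which is the quantity a frame length detector would build on.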
In this paper we present the training process of HMM models designed for use in ASR systems deployed in GSM networks. First, a brief overview of the current problems and applications of ASR systems is given, followed by a description of the MOBILDAT-SK speech database and the capabilities of SPHINX 4 and SphinxTrain. Then the process of HMM models training is presented utilizing...
Several new, original algorithms for facial feature detection are presented in this paper. The detected objects are used to produce a realistic 3D model of a human face. The presented methods have been tested, and the results are discussed in the paper. The facial feature detection is based on human skin chromaticity and morphological characteristics of the human head. Output of the skin detection is used...
This work presents the design and realization of MABox, a multimodal development system intended for developing new microphone array algorithms. The device incorporates a microphone array with four microphones, ADC cards and development software. The microphones are integrated in a separate directional box pointed at the speaker; the box is connected via USB and...