In this study, a hierarchical pitch target model is proposed to analyze the underlying factors of tones and intonation in Mandarin pitch, which can be applied in speech synthesis systems. This model assumes that the surface pitch contour is produced by approximating the sequential pitch targets assigned to the syllables in an utterance, and each pitch target possesses a hierarchical structure. Moreover,...
This paper investigates lexical stress detection for Chinese learners of English, where a combined differential acoustic feature is developed to represent the lexical stress of polysyllabic words in continuous speech. Frame-averaged features and intra-word contextual information can be input to the classifiers without normalization. The word-based stress detection method proposed in...
This paper describes the system submitted by Loquendo and Politecnico di Torino (LPT) for the 2009 NIST Language Recognition Evaluation. The system is a combination of classifiers based on two core acoustic models and on two core phone tokenizers. It exploits several state-of-the-art techniques that have been successfully applied in recent years both in speaker and in language recognition.
Information distillation is the task that aims to extract relevant passages of text from massive volumes of textual and audio sources, given a query. In this paper, we investigate two perspectives that use shallow language processing for answering open-ended distillation queries, such as “List me facts about [event]”. The first approach is a summarization-based approach that uses the unsupervised...
In this paper, we describe our efforts toward the automatic detection of English questions in meetings. We analyze the utility of various features for this task, originating from three distinct classes: lexico-syntactic, turn-related, and pitch-related. Of particular interest is the use of parse tree information in classification, an approach as yet unexplored. Results from experiments on the ICSI...
This paper describes pitch tracking techniques that combine voiced/unvoiced classification with pitch estimation based on cepstral analysis, time autocorrelation, spectro-temporal autocorrelation (STA) and the average magnitude difference function (AMDF). Pre- and post-processing techniques that improve the performance of pitch detection algorithms (PDAs) are also presented. PDAs have been evaluated by telephone...
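As a rough illustration of one technique named in the abstract above, the following is a minimal sketch of AMDF-based pitch estimation on a single frame. It is not the paper's implementation: the function names, the 60-400 Hz search range, and the synthetic 210 Hz test tone are all assumptions made for this example, and it omits the voiced/unvoiced classification and the pre- and post-processing the abstract mentions.

```python
import math


def amdf(frame, lag):
    """Average magnitude difference between the frame and a copy shifted by `lag` samples."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n


def estimate_pitch_amdf(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate pitch in Hz as sample_rate / lag, where lag minimizes the AMDF.

    The search range (f_min, f_max) is an assumed default, not a value from the paper.
    The frame must be longer than sample_rate / f_min samples.
    """
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag = min(range(lag_min, lag_max + 1), key=lambda lag: amdf(frame, lag))
    return sample_rate / best_lag


# Hypothetical test signal: a 210 Hz sine sampled at 8 kHz (telephone-band rate).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 210 * t / sample_rate) for t in range(400)]
pitch_hz = estimate_pitch_amdf(frame, sample_rate)
print(f"Estimated pitch: {pitch_hz:.1f} Hz")
```

Because the lag is an integer number of samples, the estimate falls on a coarse grid and lands near, not exactly on, 210 Hz. A plain minimum search like this is also prone to octave errors on real speech, which is one reason post-processing is commonly applied.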
Dominance, a behavioral expression of power, is a fundamental mechanism of social interaction, expressed and perceived in conversations through spoken words and audiovisual nonverbal cues. The automatic modeling of dominance patterns from sensor data represents a relevant problem in social computing. In this paper, we present a systematic study on dominance modeling in group meetings from fully...