Search results

chapter

Assessing User Bias in Affect Detection within Context-Based Spoken Dialog Systems

Syaheerah Lebai Lutfi, Fernando Fernandez-Martinez, Andres Casanova-Garcia, Lorena Lopez-Lebon, more

2012 International Conference on Privacy, Security, Risk and Trust and 2012 International Conference on Social Computing > 893 - 898

2012 International Conference on Privacy, Security, Risk and Trust (PASSAT)

This paper presents empirical evidence of user bias within a laboratory-oriented evaluation of a Spoken Dialog System. Specifically, we address bias in users' satisfaction judgements. We question the reliability of this data for modeling user emotion, focusing on contentment and frustration in a spoken dialog system. This bias is detected through machine learning experiments that were conducted...

chapter

Implications of political apology or non-apology for politicians in Malaysia: The early findings

Rugayah Hashim, Mohd. Anuar Mazuki, Adzrool Idzwan Ismail, Shaharuddin Badaruddin, more

2012 IEEE Symposium on Business, Engineering and Industrial Applications > 448 - 451

2012 IEEE Symposium on Business, Engineering and Industrial Applications (ISBEIA)

Malaysia's political scene has been full of requests and demands for apology from errant politicians on certain sensitive issues. The media has had a field day covering the various politicians' resentment, annoyance and even rage at the slightest provocation of inadequacy or other accusations on the victim's part, and a righteous stance on the accuser's part. Name-calling and gutter politics are common...

chapter

An LVCSR Based Automatic Scoring Method in English Reading Tests

Junbo Zhang, Fuping Pan, Yonghong Yan

2012 4th International Conference on Intelligent Human-Machine Systems and Cybernetics > 1 > 34 - 37

2012 4th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC)

This paper describes a reading quality scoring system based on large vocabulary continuous speech recognition (LVCSR). Our previous scoring system was based on forced alignment. A disadvantage of a forced-alignment-based system is that it can hardly catch the many kinds of reading miscues, while an LVCSR-based system avoids this disadvantage. The main challenge was that the LVCSR recognition rate was low on our...

chapter

Can Computers Help Us to Better Understand Different Cultures? Toward a Computer-Based CULINT

Danny Livshits, Newton Howard, Yair Neuman

2012 European Intelligence and Security Informatics Conference > 172 - 179

2012 European Intelligence and Security Informatics Conference (EISIC)

Identifying cultural discrepancies in worldviews is of high priority to Cultural Intelligence (CULINT). This paper presents a CULINT computer-based methodology for increasing cultural awareness. By automatically identifying themes/motifs in textual data and using machine translation, we expose cultural discrepancies in cultural understanding. This novel methodology is empirically tested through the...

chapter

Remote control system of home electrical appliances using speech recognition

Noriyuki Kawarazaki, Tadashi Yoshidome

2012 IEEE International Conference on Automation Science and Engineering (CASE) > 761 - 764

2012 IEEE International Conference on Automation Science and Engineering (CASE 2012)

This paper discusses a remote control system for home electrical appliances using speech recognition. It is a very convenient system that lets not only visually impaired people but also elderly people control household appliances with speech commands. The goal of our system is that many kinds of household appliances, such as televisions, video recorders and air conditioners, are controlled based on...

chapter

A Study on Prosody of Vietnamese Emotional Speech

Thi Duyen Ngo, The Duy Bui

2012 Fourth International Conference on Knowledge and Systems Engineering > 151 - 155

2012 Fourth International Conference on Knowledge and Systems Engineering (KSE)

This paper describes the analyses of the prosody of Vietnamese emotional speech, accomplished to find the relations between prosodic variations and emotional states in Vietnamese speech. These relations were obtained by investigating the variations of prosodic features in Vietnamese emotional speech in comparison with prosodic features of neutral speech. The analyses were performed on a multi-style...

chapter

Post-processing of the recognized speech for web presentation of large audio archive

Marek Bohac, Karel Blavka, Michaela Kucharova, Svatava Skodova

2012 35th International Conference on Telecommunications and Signal Processing (TSP) > 441 - 445

2012 35th International Conference on Telecommunications and Signal Processing (TSP)

This paper deals with a post-processing phase of automatic transcription of spoken documents stored in the large Czech Radio audio archive (containing hundreds of thousands of recordings). The ultimate goal of the project is to transcribe them and to allow public access to their content. In this paper we focus on methods and algorithms for unsupervised post-processing of automatically recognized recordings...

chapter

Combined articulatory and auditory processing for improved speech recognition

Guangpu Huang, Meng Joo Er

2012 7th IEEE Conference on Industrial Electronics and Applications (ICIEA) > 972 - 977

2012 7th IEEE Conference on Industrial Electronics and Applications (ICIEA)

In this paper, we examined the feasibility of articulatory phonetic inversion (API) conditioned on the auditory qualities for improved speech recognition. And we introduced an efficient data-driven heuristic learning algorithm to capture the articulatory-phonetic features (APFs) of English speech. Then we reported the performance of the combined auditory and articulatory processing methods in the...

chapter

An efficient parametric model for real-time 3D tongue skeletal animation

Mihai Daniel Ilie, Cristian Negrescu, Dumitru Stanomir

2012 9th International Conference on Communications (COMM) > 129 - 132

2012 9th International Conference on Communications (COMM)

In this paper, we propose a very efficient novel parametric model to describe the surface and structure of the human tongue and a corresponding mathematical model for performing 3D tongue animation. A skeletal chain of virtual bones is automatically generated depending on the geometric features of the 3D object, allowing each tongue segment to be easily manipulated by its corresponding parameters,...

chapter

Short Utterance Speaker Recognition A research Agenda

Nakhat Fatima, Thomas Fang Zheng

2012 International Conference on Systems and Informatics (ICSAI2012) > 1746 - 1750

2012 International Conference on Systems and Informatics (ICSAI)

Short Utterance Speaker Recognition (SUSR) is an important area of speaker recognition when only a small amount of speech data is available for testing and training. We list the most commonly used state-of-the-art methods of speaker recognition and the significance of prosodic speaker recognition. A short survey of SUSR is hereby conducted, highlighting various methodologies when using short utterances...

chapter

The extraction and simulation of Mel frequency cepstrum speech parameters

Hongyu Xu, Xia Zhang, Liang Jia

2012 International Conference on Systems and Informatics (ICSAI2012) > 1765 - 1768

2012 International Conference on Systems and Informatics (ICSAI)

This paper takes into consideration the characteristics of voice processing by the human auditory system, adopts a triangular filter for signal preprocessing, and uses logarithm operations on all filter outputs to extract Mel frequency cepstrum coefficients (MFCC). By Matlab simulation of MFCC vectors of typical male and female signals, an analysis is given of the probability to be applied...

chapter

How different kinds of sound in videos can influence gaze

Guanghan Song, Denis Pellerin, Lionel Granjon

2012 13th International Workshop on Image Analysis for Multimedia Interactive Services > 1 - 4

2012 13th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS)

This paper presents an analysis of the effect of thirteen different kinds of sound on visual gaze when looking freely at videos to help to predict eye positions. First, an audio-visual experiment was designed with two groups of participants, with audio-visual (AV) and visual (V) conditions, to test the sound effect. Then, an audio experiment was designed to validate the classification of sound we...

chapter

A novel approach to identify problematic call center conversations

Meghna Abhishek Pandharipande, Sunil Kumar Kopparapu

2012 Ninth International Conference on Computer Science and Software Engineering (JCSSE) > 1 - 5

2012 International Joint Conference on Computer Science and Software Engineering (JCSSE)

Voice based call centers enable customers to query for information by speaking to agents in the call center. Most often these call conversations are recorded for analysis with the intent of trying to identify things that can help improve the performance of the call center to serve the customer better. Today the recorded conversations are analyzed by humans by listening to call conversations, which...

chapter

Electrolarynx Speech Enhancement Using a Minimum Mean-Square Error Spectral Amplitude Estimator

Ya Bai, Ming Xi Wan, Su Pin Wang

2012 International Conference on Biomedical Engineering and Biotechnology > 793 - 796

2012 International Conference on Biomedical Engineering and Biotechnology (iCBEB)

Although electrolarynx speech provides an important means of oral communication for laryngectomees, the resulting speech is of poor intelligibility due to the radiated noise caused by the instrument. This paper concentrates on the derivation of a minimum mean-square error spectral amplitude estimator and on its application in electrolarynx speech enhancement; also, the frequency domain...

chapter

A novel approach for emotion classification based on fusion of text and speech

Ali Houjeij, Layla Hamieh, Nader Mehdi, Hazem Hajj

2012 19th International Conference on Telecommunications (ICT) > 1 - 6

2012 19th International Conference on Telecommunications (ICT)

In this paper we design a system that adopts a novel approach for emotional classification from human dialogue based on text and speech context. Our main objective is to boost the accuracy of speech emotional classification by accounting for the features extracted from the spoken text. The proposed system concatenates text and speech features and feeds them as one input to the classifier. The work...

chapter

Performance of the OZU speaker verification systems with the NIST SRE 2010 data in a multi-class scenario

Fatih Yesil, Cenk Demiroglu

2012 20th Signal Processing and Communications Applications Conference (SIU) > 1 - 4

2012 20th Signal Processing and Communications Applications Conference (SIU)

The performance of speaker verification systems is typically measured based on their binary decision accuracy. However, in speaker verification applications where close to 100% accuracy is required, such as the systems used in the call centers of finance companies, it is not possible to rely on the binary decisions of the existing verification systems. Still, in such cases, multi-class verification...

chapter

Topic identification based extrinsic evaluation of summarization techniques applied to conversational speech

David Harwath, Timothy J. Hazen

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5073 - 5076

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

Document summarization algorithms are most commonly evaluated according to the intrinsic quality of the summaries they produce. An alternate approach is to examine the extrinsic utility of a summary, measured by the ability of the summary to aid a human in the completion of a specific task. In this paper, we use topic identification as a proxy for relevancy determination in the context of an information...

chapter

Classification of emotional content of sighs in dyadic human interactions

Rahul Gupta, Chi-Chun Lee, Shrikanth Narayanan

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 2265 - 2268

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

Emotions are an important part of human communication and are expressed both verbally and non-verbally. Common nonverbal vocalizations such as laughter, cries and sighs carry important emotional content in conversations. Sighs often are associated with negative emotion. In this work, we show that emotional sighs exist along both ends of the valence axis (positive-emotion vs. negative-emotion sighs)...

chapter

Machine recognition vs human recognition of voices

Stanley J. Wenndt, Ronald L. Mitchell

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4245 - 4248

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

While automated speaker recognition by machines can be quite good as seen in NIST Speaker Recognition Evaluations, performance can still suffer when the environmental conditions, emotions, or recording quality changes. This research examines how robust humans are compared to machine recognition for changing environments. Several data conditions including short sentences, frequency selective noise,...

chapter

Generalized F0 modelling with absolute and relative pitch features for singing voice synthesis

S. W. Lee, Shen Ting Ang, Minghui Dong, Haizhou Li

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 429 - 432

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

Natural pitch fluctuations are essential to human singing. To effectively synthesize singing voice, the generation of these pitch fluctuations is necessary. Previous synthesis methods classify and reproduce them individually. These fluctuations, however, are found to be dependent and vary under different contexts. This paper proposes a generalized framework for F0 modelling to learn and generate these...

INFONA - science communication portal
