We are now developing a Japanese speaking test called SCAT, which is part of J-CAT (Japanese Computerized Adaptive Test), a free online proficiency test for Japanese language learners. In this paper, we focus on the sentence-reading-aloud task and the sentence generation task in SCAT, and propose an automatic scoring method for estimating the overall score of answer speech, which is holistically determined...
In this paper, we explore the retrieval of perceptually similar audio, which focuses on finding sounds according to human perception. Such retrieval is thus more “human-centered” [1] than previous audio retrieval approaches, which aim to find homologous sounds. We make comprehensive use of various acoustic features to measure perceptual similarity. Since some acoustic features may be redundant or even adverse...
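The kind of feature-based perceptual similarity described above can be illustrated with a minimal sketch, assuming librosa and NumPy are available; the choice of MFCCs plus spectral centroid and of cosine similarity is a hypothetical simplification, not the authors' actual feature set:

```python
# Illustrative sketch only: summarize each sound as a fixed-length acoustic
# feature vector, then rank candidates by cosine similarity to the query.
import librosa
import numpy as np

def perceptual_features(path):
    """Mean MFCCs (timbre) plus mean spectral centroid (brightness)."""
    y, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr)
    return np.concatenate([mfcc.mean(axis=1), centroid.mean(axis=1)])

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```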
This paper presents a method of automatic lexical stress assessment for L2 English speech. Syllable stress can be labeled at three levels - primary (P), secondary (S) and no (N) stress, but secondary stress may vary among word pronunciations within and across accents and present difficulties for human perception. Hence, evaluation of lexical stress based on all three levels (i.e., the P-S-N criterion...
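To make the P-S-N issue concrete, here is a small worked example of the contrast the abstract raises: agreement computed on all three levels versus on a collapsed two-level criterion in which the perceptually unstable secondary stress is merged with no stress. The collapse rule and the labels are assumptions for illustration, not the paper's evaluation protocol:

```python
# Hypothetical example: a syllable heard as secondary (S) by the annotator
# but labeled unstressed (N) by the system counts as an error under P-S-N,
# yet agrees under the collapsed stressed/unstressed criterion.
def collapse(labels):
    return ["P" if l == "P" else "N" for l in labels]

ref = ["P", "S", "N", "N"]   # reference stress labels for one word
hyp = ["P", "N", "N", "N"]   # system output
psn = sum(r == h for r, h in zip(ref, hyp)) / len(ref)                      # 0.75
two = sum(r == h for r, h in zip(collapse(ref), collapse(hyp))) / len(ref)  # 1.0
print(psn, two)
```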
Previous studies using functional magnetic resonance imaging (fMRI) have demonstrated that the left hemisphere is specialized for language function. On the other hand, some studies have revealed that the right hemisphere is also related to language function. The hypotheses of this study were that (1) the regions related to language function form a bilateral functional network and (2) the level of...
This paper presents empirical evidence of user bias within a laboratory-oriented evaluation of a Spoken Dialog System. Specifically, we address user bias in satisfaction judgements. We question the reliability of these data for modeling user emotion, focusing on contentment and frustration in a spoken dialog system. This bias is detected through machine learning experiments that were conducted...
This paper describes a reading quality scoring system based on large vocabulary continuous speech recognition (LVCSR). Our previous scoring system was based on forced alignment. A disadvantage of the forced-alignment-based system is that it can hardly detect the wide variety of reading miscues, whereas the LVCSR-based system avoids this limitation. The main challenge was that the LVCSR recognition rate was low on our...
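A minimal sketch of why an LVCSR-based scorer can catch miscues that forced alignment cannot: the recognizer's free output is aligned against the reference passage, and insertions, deletions and substitutions are flagged. The alignment via Python's difflib and the example sentences are illustrative, not the paper's implementation:

```python
import difflib

reference  = "the quick brown fox jumps over the lazy dog".split()
recognized = "the quick brown fox jump over lazy dog".split()  # hypothetical ASR output

# Align hypothesis words against the reference and report miscues.
matcher = difflib.SequenceMatcher(a=reference, b=recognized)
for op, i1, i2, j1, j2 in matcher.get_opcodes():
    if op != "equal":
        print(op, reference[i1:i2], "->", recognized[j1:j2])
# replace ['jumps'] -> ['jump'];  delete ['the'] -> []
```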
This paper deals with a post-processing phase of automatic transcription of spoken documents stored in the large Czech Radio audio archive (containing hundreds of thousands of recordings). The ultimate goal of the project is to transcribe them and to allow public access to their content. In this paper we focus on methods and algorithms for unsupervised post-processing of automatically recognized recordings...
In this paper, we examined the feasibility of articulatory phonetic inversion (API) conditioned on auditory qualities for improved speech recognition. We also introduced an efficient data-driven heuristic learning algorithm to capture the articulatory-phonetic features (APFs) of English speech. We then reported the performance of the combined auditory and articulatory processing methods in the...
This paper presents an analysis of the effect of thirteen different kinds of sound on visual gaze when looking freely at videos, to help predict eye positions. First, an audio-visual experiment was designed with two groups of participants, under audio-visual (AV) and visual (V) conditions, to test the effect of sound. Then, an audio experiment was designed to validate the classification of sound we...
Voice-based call centers enable customers to query for information by speaking to agents in the call center. Most often these conversations are recorded for analysis, with the intent of identifying things that can help improve the performance of the call center and serve the customer better. Today, the recorded conversations are analyzed by humans who listen to them, which...
In this paper we design a system that adopts a novel approach to emotion classification from human dialogue based on text and speech context. Our main objective is to boost the accuracy of speech emotion classification by accounting for features extracted from the spoken text. The proposed system concatenates text and speech features and feeds them as one input to the classifier. The work...
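The feature-level fusion the abstract describes can be sketched as follows, assuming NumPy and scikit-learn; the feature dimensions, random data, and choice of classifier are placeholders for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
text_feats   = rng.random((n, 50))     # e.g. lexical/sentiment features
speech_feats = rng.random((n, 30))     # e.g. prosodic/spectral features
labels       = rng.integers(0, 4, n)   # four hypothetical emotion classes

# Concatenate both modalities into one vector per utterance,
# then train a single classifier on the fused input.
fused = np.hstack([text_feats, speech_feats])
clf = LogisticRegression(max_iter=1000).fit(fused, labels)
```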
The performance of speaker verification systems is typically measured by their binary decision accuracy. However, in speaker verification applications where close to 100% accuracy is required, such as systems used in the call centers of finance companies, it is not possible to rely on the binary decisions of existing verification systems. Still, in such cases, multi-class verification...
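One way to read "not relying on binary decisions" is a reject option: scores in a low-confidence band are escalated rather than forced into accept/reject. This sketch and its thresholds are purely illustrative, not the verification scheme the paper develops:

```python
def decide(score, accept_t=0.9, reject_t=0.1):
    """Three-way decision with a deferral band instead of a hard threshold."""
    if score >= accept_t:
        return "accept"
    if score <= reject_t:
        return "reject"
    return "refer"   # low confidence: escalate to a human agent

for s in (0.95, 0.50, 0.02):
    print(s, decide(s))
```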
Emotions are an important part of human communication and are expressed both verbally and non-verbally. Common nonverbal vocalizations such as laughter, cries and sighs carry important emotional content in conversations. Sighs are often associated with negative emotion. In this work, we show that emotional sighs exist at both ends of the valence axis (positive-emotion vs. negative-emotion sighs)...
We introduce a novel metric for speech recognition success in voice search tasks, designed to reflect the impact of speech recognition errors on the user's overall experience with the system. The computation of the metric is seeded using intuitive labels from human subjects and subsequently automated by replacing human annotations with a machine learning algorithm. The results show that search-based recognition...
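The seeding-then-automation loop the abstract outlines can be sketched as a classifier trained on human success labels and then applied to unlabeled interactions. The features (word error count, whether a result was clicked) and the data are invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X_seed = np.array([[0, 1], [3, 0], [1, 1], [5, 0]])  # [word errors, clicked]
y_seed = np.array([1, 0, 1, 0])                      # human "success" labels

# Train on the human-labeled seed set, then replace the annotators.
model = LogisticRegression().fit(X_seed, y_seed)
print(model.predict([[2, 1]]))   # automated judgment for a new interaction
```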
Computer lip-reading is one of the great signal processing challenges. Not only is the signal noisy, it is also highly variable. However, comparisons with human lip-readers are almost unknown, partly because of the paucity of skilled human lip-readers and partly because most automatic systems handle only data that are trivial and therefore not representative of human speech. Here we generate a...
This paper shows that pattern classification based on machine learning is a powerful tool for analyzing human brain activity data obtained by magnetoencephalography (MEG). In our previous work, a weighting method using multiple kernel learning was proposed, but this method had a high computational cost. In this paper, we propose a novel and fast weighting method using an AdaBoost algorithm to find...
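A rough sketch of boosting-based feature weighting in the spirit described above, using scikit-learn's AdaBoostClassifier on stand-in MEG data; the data shapes and the use of feature importances as channel weights are assumptions, not the authors' algorithm:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(0)
n_trials, n_channels = 300, 64
X = rng.standard_normal((n_trials, n_channels))  # stand-in MEG sensor features
y = rng.integers(0, 2, n_trials)                 # two stimulus conditions

# Boosting over decision stumps yields a per-feature importance that can
# serve as a cheap weighting, avoiding the cost of multiple kernel learning.
ada = AdaBoostClassifier(n_estimators=100).fit(X, y)
weights = ada.feature_importances_
print("most informative channels:", np.argsort(weights)[::-1][:5])
```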
Whenever we listen to a voice for the first time, we attribute personality traits to the speaker. The process takes place in a few seconds and is spontaneous and unconscious. While the process is not necessarily accurate (attributed traits do not necessarily correspond to the actual traits of the speaker), it nevertheless significantly influences our behavior toward others, especially when it comes to social...
In a leading service economy like India, services lie at the very center of economic activity. Competitive organizations now look not only at the skills and knowledge of an employee, but also at the behavior required to be successful on the job. Emotionally competent employees can effectively deal with occupational stress and maintain psychological well-being. This study explores the scope of the...
This paper presents the reliability of a multilayer perceptron (MLP) in speaker identification using characteristics extracted from speakers' voices. Classification accuracy depends on the speaking condition and varies by up to 23% across conditions. Results of the simulation experiment show that the MLP is effective in speaker identification, especially in the case of retelling and synchronous speech, where we achieved...
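A minimal sketch of an MLP speaker-identification setup of the kind described, using scikit-learn's MLPClassifier; the feature extraction, data shapes, and network size are placeholders, not the paper's configuration:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n_utts, n_feats, n_speakers = 500, 20, 10
X = rng.standard_normal((n_utts, n_feats))   # e.g. cepstral features per utterance
y = rng.integers(0, n_speakers, n_utts)      # speaker identities

# One hidden layer; the classifier maps voice features to speaker labels.
mlp = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500).fit(X, y)
print("training accuracy:", mlp.score(X, y))
```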
Spoken interactions usually have accurate timing and alignment between interlocutors: turn-taking and topic flow are managed in a manner that provides conversational fluency and smooth progress of the interaction. Turn-taking and topic flow are also important in applications such as robot companions that interact with a user in real time. The creation of a multimodal conversational corpus for modeling...