This paper proposes a system to convert neutral speech to emotional speech with controlled emotion intensity. Most previous research on synthesizing emotional voices used statistical or concatenative methods that synthesize categorical emotional states such as joy, anger, and sadness. However, humans sometimes enhance or relieve emotional states and their intensity during daily life,...
This paper proposes an emotional speech synthesis system based on a three-layered model using a dimensional approach. Most previous studies on emotional speech synthesis using the dimensional approach focused only on the relationship between acoustic features and the emotion dimensions (valence and activation). However, people do not perceive emotion directly from acoustic features. Hence, the...
This paper proposes a newly revised three-layered model to improve the estimation of emotion dimensions (valence, activation) in a bilingual scenario, using knowledge of the commonalities and differences in human perception across multiple languages. Most previous speech emotion recognition systems worked only within a single language. However, to construct a generalized emotion recognition system that...
Speech-to-speech translation (S2ST) is the process by which a spoken utterance in one language is used to produce a spoken output in another language. The conventional approach to S2ST has focused on processing linguistic information only by directly translating the spoken utterance from the source language to the target language without taking into account paralinguistic and non-linguistic information...
Speech-to-speech translation (S2ST) systems are important for the process by which a spoken utterance in one language is used to produce a spoken output in another language. S2ST techniques have so far mainly used linguistic information, without para- and non-linguistic information (emotion, individuality, gender, etc.). Therefore, these systems are limited in synthesizing affective...
The speech transmission index (STI) is an objective measurement that is used to assess the quality of speech transmission in room acoustics. This paper proposes a simplified method of blindly estimating the STI in room acoustics based on the concept of the modulation transfer function (MTF). STI can be estimated with this method in four steps: (1) MTF is estimated in the whole band from the reverberant...
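The MTF-to-STI mapping that underlies step (1) and the following steps can be sketched as below. This is an illustrative sketch of the standard MTF-based STI recipe (apparent SNR per modulation index, clipping to ±15 dB, normalization, averaging), not the paper's simplified blind estimator; `sti_from_modulation_indices` is a hypothetical name, and the per-band weighting details are omitted.

```python
import numpy as np

def sti_from_modulation_indices(m):
    """Map modulation indices m (0 < m < 1) to an STI value.

    Standard MTF-based recipe: compute an apparent SNR from each
    modulation index, clip it to +/-15 dB, normalize each value to a
    transmission index in [0, 1], and average.
    """
    m = np.clip(np.asarray(m, dtype=float), 1e-6, 1 - 1e-6)
    snr_apparent = 10.0 * np.log10(m / (1.0 - m))   # apparent SNR in dB
    snr_clipped = np.clip(snr_apparent, -15.0, 15.0)
    ti = (snr_clipped + 15.0) / 30.0                # transmission index in [0, 1]
    return float(np.mean(ti))

# Well-preserved modulation (m close to 1) yields an STI close to 1
print(sti_from_modulation_indices([0.99, 0.95, 0.9]))
```

A modulation index of 0.5 corresponds to 0 dB apparent SNR and thus a transmission index of exactly 0.5, which is a convenient sanity check for the mapping.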
In this paper, we investigate eight objective measures for predicting the intelligibility of noisy Japanese speech signals before and after noise-reduction processing. The Japanese speech signals were first corrupted by three types of noise at two signal-to-noise ratios and then processed by four classes of noise-reduction algorithms; their intelligibility was subsequently predicted by the objective measures...
Concatenative speech synthesis (CSS) provides the greatest naturalness. However, it requires a huge stored database, resulting in a large footprint. Reducing the size of the stored database while preserving the quality of CSS, i.e. improving the quality-to-size ratio (QSr), is still a challenge. In this paper, we propose a method of transforming fundamental frequency (F0) contours of lexical tones, developed...
In this paper, the performance of eight state-of-the-art objective measures is evaluated in terms of predicting the speech intelligibility in Mandarin of signals processed by noise-reduction algorithms. The speech signals were first corrupted by three types of noise at two signal-to-noise ratios and subsequently processed by four classes of noise-reduction algorithms, followed by objective intelligibility...
This paper proposes a three-layer model for estimating the emotions expressed in a speech signal based on a dimensional approach. Most of the previous studies using the dimensional approach focused mainly on the direct relationship between acoustic features and emotion dimensions (valence, activation, and dominance). However, the acoustic features that correlate with the valence dimension are fewer,...
The quality of unit-based concatenative speech synthesis is low, while that of corpus-based concatenative speech synthesis with unit selection is highly natural. However, unit selection requires a huge database for concatenation, which reduces the range of its applications. In this paper, by using temporal decomposition to model intra-syllable and inter-syllable contextual effects, we propose a context-fitting...
To construct a front-end for ASR systems using a small-scale microphone array in real environments, robustness against sudden unstable noises, multiple noise sources, and near-field sound sources is required. This paper proposes a front-end method for enhancing target signals that subtracts estimated noise from the noisy signal in each sub-band using paired microphones. The proposed method assumes one integrated...
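The subtraction step described above can be illustrated with a generic single-channel magnitude spectral subtraction, sketched below. This is not the paper's paired-microphone sub-band estimator; `spectral_subtract` and the `floor` parameter are hypothetical names, and the noise estimate is assumed to be given rather than derived from a microphone pair.

```python
import numpy as np

def spectral_subtract(noisy, noise_est, floor=0.01):
    """Generic magnitude spectral subtraction in the frequency domain.

    Subtracts the estimated noise magnitude spectrum from the noisy
    magnitude spectrum, applies a spectral floor to limit musical
    noise, and resynthesizes using the noisy-signal phase.
    """
    Y = np.fft.rfft(noisy)
    N = np.fft.rfft(noise_est)
    mag = np.abs(Y) - np.abs(N)                     # magnitude subtraction
    mag = np.maximum(mag, floor * np.abs(Y))        # spectral floor
    return np.fft.irfft(mag * np.exp(1j * np.angle(Y)), n=len(noisy))
```

In a sub-band variant such as the one the abstract describes, the same subtract-and-floor operation would be applied per sub-band, with the noise spectrum estimated from the paired microphones instead of supplied directly.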