Masashi Unoki

chapter

Feasibility of vocal emotion conversion on modulation spectrogram for simulated cochlear implants

Zhi Zhu, Ryota Miyauchi, Yukiko Araki, Masashi Unoki

2017 25th European Signal Processing Conference (EUSIPCO) > 1834 - 1838

Cochlear implant (CI) listeners were found to have great difficulty with vocal emotion recognition because of the limited spectral cues provided by CI devices. Previous studies have shown that the modulation spectral features of temporal envelopes may be important cues for vocal emotion recognition of noise-vocoded speech (NVS) as simulated CIs. In this paper, the feasibility of vocal emotion conversion...

chapter

Robust front-end for speech recognition by human and machine in noisy reverberant environments: The effect of phase information

Yang Liu, Naushin Nower, Shota Morita, Masashi Unoki

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

This paper proposes a robust front-end for speech applications based on a restoration scheme of instantaneous amplitude and phase. Typical applications such as hearing aids and automatic speech recognition systems still have challenging issues with regard to robustness against noise and reverberation. The proposed front-end employed a combination of our previously proposed method for restoring instantaneous...

chapter

An Automatic Watermarking in CELP Speech Codec Based on Formant Tuning

Erick Christian Garcia Alvarez, Shengbei Wang, Masashi Unoki

2015 International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP) > 160 - 163

This paper proposes the unification of the code-excited linear prediction (CELP) codec process with watermarking based on formant tuning. The serial problem in watermarking and then encoding with the CELP codec was thereby reduced by using the proposed method, which also increased the bit detection rate. We took advantage of two key properties: I) humans do not perceive alterations applied to formants...

chapter

Restoration of instantaneous amplitude and phase of speech signal in noisy reverberant environments

Yang Liu, Naushin Nower, Yonghong Yan, Masashi Unoki

2015 23rd European Signal Processing Conference (EUSIPCO) > 879 - 883

We have proved that restoring the instantaneous amplitude as well as the instantaneous phase on a Gammatone filterbank plays a significant role in speech enhancement. However, dereverberation is still a challenging topic, since the previously proposed scheme can only work in noisy environments. In this paper, we extend our previously proposed scheme to be general speech enhancement of removing the effects...

chapter

Hybrid Speech Watermarking Based on Formant Enhancement and Cochlear Delay

Shengbei Wang, Masashi Unoki

2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing > 272 - 275

2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP)

Illegal use of digital technologies has brought a series of problems in speech protection and authorization. Digital watermarking can effectively solve these problems by embedding watermarks into the host signals. This paper proposes a hybrid watermarking method for speech signals based on the concepts of formant enhancement (FE) and cochlear delay (CD). This hybrid method utilizes the source-filter...

chapter

Watermarking Method for Speech Signals Based on Modifications to LSFs

Shengbei Wang, Masashi Unoki

2013 Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing > 283 - 286

2013 Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP)

We propose a method of speech watermarking based on modifications to line spectral frequencies (LSFs) of original speech. LSFs were derived from each frame with linear prediction (LP) analysis and watermarks were embedded into them by using the quantization index modulation (QIM) of different quantization steps. We took into consideration inaudibility and robustness that were influenced by minor modifications...

chapter

Study on Method for Estimating F0 of Steady Complex Tone in Noisy Reverberant Environments

Kenichiro Miwa, Masashi Unoki

2013 Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing > 456 - 459

2013 Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP)

This paper proposes a method for robustly and accurately estimating fundamental frequency (F0) of the steady complex tone on the basis of an amplitude modulation/demodulation technique. It is based on the well-known mechanism of pitch perception for AM tone. The comparative results revealed that the percentage correct rates of the estimated F0s using a few recent methods (TEMPO, PHIA, and CmpCep)...

chapter

IMM-based feature compensation robust to slowly time-varying noise and reverberation

Shin Jae Kang, Chang Woo Han, Kang Hyun Lee, Nam Soo Kim, et al.

2013 IEEE China Summit and International Conference on Signal and Information Processing > 313 - 317

2013 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)

In this paper, we propose a novel feature compensation approach based on the interacting multiple model (IMM) algorithm specially designed for joint processing of background noise and acoustic reverberation. Our approach to cope with the time-varying environmental parameters is to establish a switching linear dynamic model for the additive and convolutive distortions in the log-spectral domain. The...

chapter

Blind method of estimating speech transmission index in room acoustics based on concept of modulation transfer function

Masashi Unoki, Tomohiro Ikeda, Kyohei Sasaki, Ryota Miyauchi, et al.

2013 IEEE China Summit and International Conference on Signal and Information Processing > 308 - 312

2013 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)

The speech transmission index (STI) is an objective measurement that is used to assess the quality of speech transmission in room acoustics. This paper proposes a simplified method of blindly estimating the STI in room acoustics based on the concept of the modulation transfer function (MTF). STI can be estimated with this method in four steps: (1) MTF is estimated in the whole band from the reverberant...

article

Controlling Tradeoff Between Approximation Accuracy and Complexity of a Smooth Function in a Reproducing Kernel Hilbert Space for Noise Reduction

Xugang Lu, Masashi Unoki, Shigeki Matsuda, Chiori Hori, et al.

IEEE Transactions on Signal Processing > 2013 > 61 > 3 > 601 - 610

Noise reduction algorithms are widely used to mitigate noise effects on speech to improve the robustness of speech technology applications. However, they inevitably cause speech distortion. The tradeoff between noise reduction and speech distortion is a key concern in designing noise reduction algorithms. This study proposes a novel framework for noise reduction by considering this tradeoff. We regard...

chapter

Robust voice activity detection using empirical mode decomposition and modulation spectrum analysis

Yasuaki Kanai, Masashi Unoki

2012 8th International Symposium on Chinese Spoken Language Processing > 400 - 404

2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP 2012)

Voice activity detection (VAD) is used to detect speech/non-speech periods in observed signals. However, the current VAD technique has a serious problem in that the accuracy of detection of speech periods drastically reduces if it is used for noisy speech and/or for mixtures of speech/non-speech such as those in music and environmental sounds. Thus, VAD needs to be robust to enable speech periods...

chapter

Controlling the tradeoff property in a regularization framework for noise reduction

Xugang Lu, Masashi Unoki, Shigeki Matsuda, Chiori Hori, et al.

2012 8th International Symposium on Chinese Spoken Language Processing > 201 - 205

2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP 2012)

The tradeoff between noise reduction and speech distortion is a key concern in designing noise reduction algorithms. We have proposed a regularization framework for noise reduction with the consideration of the tradeoff problem. We regard speech estimation as a functional approximation problem in a reproducing kernel Hilbert space (RKHS). In the estimation, the objective function is formulated to...

chapter

Unified denoising and dereverberation method used in restoration of MTF-based power envelope

Masashi Unoki, Xugang Lu

2012 8th International Symposium on Chinese Spoken Language Processing > 215 - 219

2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP 2012)

Recent methods of speech enhancement have been proposed to suppress the effects of background noise and reverberation. The effect of background noise in these methods is regarded as additive and that of reverberation is convolutive. Therefore, methods of reducing noise and dereverberation have been applied separately in tandem. We previously unified the effects of noise and reverberation in the modulation...

chapter

Improvements to Creativity in Singing Abilities Based on Perspective of Studies on Interaction between Speech Production and Auditory Perception

Masashi Unoki, Kazushi Nishimoto

2012 Seventh International Conference on Knowledge, Information and Creativity Support Systems > 157 - 160

2012 7th International Conference on Knowledge, Information and Creativity Support Systems (KICSS)

Singing and speaking are important and natural ways in communications for humans to express nonlinguistic and linguistic information. It seems the majority of common people correctly perform and imitate all factors such as pitches and melodies as the same as those achieved by professional singers, while they can correctly vocalize all factors involved in speaking. There is no absolute answer as to...

chapter

Detection of Tampering in Speech Signals with Inaudible Watermarking Technique

Masashi Unoki, Ryota Miyauchi

2012 Eighth International Conference on Intelligent Information Hiding and Multimedia Signal Processing > 118 - 121

2012 Eighth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP)

There have recently been serious social issues involved in multimedia signal processing such as malicious attacks and tampering with digital audio/speech signals. Fragile speech watermarking is a technique that enables the detection of tampering with the original signals. We previously proposed an inaudible digital-audio watermarking approach based on cochlear delay. We investigated how the proposed...

chapter

Speech enhancement as a functional approximation and generalization

Xugang Lu, Masashi Unoki, Ryosuke Isotani, Hisashi Kawai, et al.

2010 7th International Symposium on Chinese Spoken Language Processing > 18 - 22

7th International Symposium on Chinese Spoken Language Processing (ISCSLP 2010)

Noise reduction is used to reduce the noise effect on speech, and is important for many real speech applications. However, noise reduction inevitably causes speech distortion. The trade-off between noise reduction and speech distortion is always a key concern in designing noise reduction algorithms. In this study, we took a new look at this problem, and regarded the speech estimation as a functional...

INFONA - science communication portal

Search results for: Masashi Unoki
