Search results

chapter

Robust voice activity detection using gammatone filtering and entropy

W. Q. Ong, A. W. C. Tan

2016 International Conference on Robotics, Automation and Sciences (ICORAS) > 1 - 5

2016 International Conference on Robotics, Automation and Sciences (ICORAS)

A voice activity detector (VAD) is used to detect the presence or absence of human voice in a signal. A robust VAD algorithm is essential to distinguish human voice in a noisy acoustic signal. Many recent works on robust VAD focus on unsupervised feature extraction, such as temporal variation and signal-to-noise ratio [1]. However, these methods are typically...

chapter

Real-time speaker identification system using cepstral features

Monalisha Barik, Susanta Kumar Sarangi, Sushanta Kumar Sahu

2016 2nd International Conference on Communication Control and Intelligent Systems (CCIS) > 89 - 93

2016 2nd International Conference on Communication, Control & Intelligent Systems (CCIS)

A real-time speaker identification (SI) system is an application of biometrics in which voice samples are collected in real time. As a result, noise contamination of the speaker samples is the natural scenario. In this work, we tried to increase the accuracy of a real-time SI system. We analysed the SI system using different feature extraction methods with a GMM-ML classifier. We found that MFCC...

chapter

GFCC-based robust gender detection

M. A. Islam

2016 International Conference on Innovations in Science, Engineering and Technology (ICISET) > 1 - 4

2016 International Conference on Innovations in Science, Engineering and Technology (ICISET)

Gender classification is a signal processing technique that comprises feature extraction and behavioural gender modelling. Fundamental frequency and pitch are the most commonly used features for gender detection due to their unique characteristics in the voice source. In this study, a Gammatone Frequency Cepstral Coefficient (GFCC)-based robust gender classification method is presented. This study...

chapter

Frequency Domain Linear Prediction-based robust text-dependent speaker identification

M. A. Islam

2016 International Conference on Innovations in Science, Engineering and Technology (ICISET) > 1 - 4

2016 International Conference on Innovations in Science, Engineering and Technology (ICISET)

Speaker identification is a biometric technique for determining an unknown speaker's identity among a number of speakers using distinguishing latent information in uttered speech. Crime investigation, security control, telephone banking and trading, and information reservation are some applications of this technique. Frequency Domain Linear Prediction (FDLP) is a time-frequency-based feature that has been...

chapter

F₀ estimation of speech based on IRAPT using WLP-based TV-CAR analysis

Wei Shan, Keiichi Funaki

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 4

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP)

Fundamental frequency (F₀) estimation plays an important role in speech processing tasks such as speech coding, synthesis, and recognition. Although present F₀ estimation methods perform well under clean conditions, their performance deteriorates significantly in noisy environments. For this reason, F₀ estimation that is robust against additive noise is in demand. We have previously proposed F₀ estimation methods...

chapter

Interferences suppression using two closely-spaced microphones

Zhong-hua Fu

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP)

Interferences, especially competing speakers, significantly influence speech intelligibility. In this paper, we propose an interference suppression method using only two closely-spaced microphones. Firstly, a 1st-order hypercardioid differential microphone array (DMA) with white noise gain (WNG) constraint is designed in the STFT domain, which solves the amplification problem of DMA on incoherent...

chapter

On the impact of normalizing power-based features on robustness against noise for speech recognition

Hilman F. Pardede

2016 8th International Conference on Information Technology and Electrical Engineering (ICITEE) > 1 - 6

2016 8th International Conference on Information Technology and Electrical Engineering (ICITEE)

Many power-based features have been proposed in previous studies as alternatives to the conventional feature, i.e. MFCC, for speech recognition. These features are of interest because they have been shown empirically to be more robust than MFCC in noisy environments. Some studies argue that compression by power functions, which are less sensitive than the log for low-energy spectra, is one of the reasons...

chapter

Continuous gesture recognition by using gesture spotting

Daeha Lee, Hosub Yoon, Jaehong Kim

2016 16th International Conference on Control, Automation and Systems (ICCAS) > 1496 - 1498

2016 16th International Conference on Control, Automation and Systems (ICCAS)

A gesture is performed not as a single action but as a combination of continuous actions. Knowing the start and end of a gesture is very important for accurate gesture recognition. In this paper, to extract a meaningful gesture portion in an online situation, we introduce a method that can distinguish the start and end of a gesture. Then, we describe the method of recognizing an extracted gesture.

chapter

HTL: Modelo para la extracción, estructuración y visualización de eventos médicos a partir de texto narrativo en historias clínicas electrónicas

Eddie Paul Hernandez Hernandez, Alexandra Pomares Quimbaya

2016 IEEE 11th Colombian Computing Conference (CCC) > 1 - 9

2016 IEEE 11th Colombian Computing Conference (CCC)

Electronic health records contain important patient information that can serve as input for retrospective analysis of the diagnosis, follow-up, and treatment of a disease. This information is recorded in narrative form, which limits the ability to identify medical events (such as medical appointments, medication prescriptions, treatments, procedures...

chapter

Mapping Mel sub-band energies using Deep belief network for robust speech recognition

Mojtaba Gholamipour, Babak Nasersharif

2016 8th International Symposium on Telecommunications (IST) > 510 - 514

2016 8th International Symposium on Telecommunications (IST)

Sub-band speech processing is well known in robust speech recognition. In recent years, deep neural networks (DNNs) have also been widely used in speech recognition for acoustic modeling as well as feature extraction and transformation. In this paper, we propose to use a deep belief network (DBN) as a post-processing method for de-noising at the Mel sub-band level where we enhance logarithm...

chapter

Robust phonetic segmentation using multi-taper spectral estimation for noisy and clipped speech

Bhavik Vachhani, Chitralekha Bhat, Sunil Kopparapu

2016 24th European Signal Processing Conference (EUSIPCO) > 1343 - 1347

2016 24th European Signal Processing Conference (EUSIPCO)

Robust phonetic segmentation is extremely important for several speech processing tasks such as phone level articulation analysis and error detection, speech synthesis, and annotation. In this paper, we present an unsupervised phonetic segmentation approach and its application to noisy and clipped speech such as mobile phone recordings. We propose a multi-taper-based Perceptual Linear Prediction (PLP)...

chapter

Robust scoring of voice exercises in computer-based speech therapy systems

Mariana Diogo, Maxine Eskenazi, Joao Magalhaes, Sofia Cavaco

2016 24th European Signal Processing Conference (EUSIPCO) > 393 - 397

2016 24th European Signal Processing Conference (EUSIPCO)

Speech therapy is essential to help children with speech sound disorders. While some computer tools for speech therapy have been proposed, most focus on articulation disorders. Another important aspect of speech therapy is voice quality, but little research has addressed this issue. As a contribution to fill this gap, we propose a robust scoring model for voice exercises often used in speech...

chapter

A retrieval algorithm for encrypted speech based on perceptual hashing

Huan Zhao, Shaofang He

2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD) > 1840 - 1845

2016 12th International Conference on Natural Computation and 13th Fuzzy Systems and Knowledge Discovery (ICNC-FSKD)

In order to further improve the robustness and discrimination of perceptual hashing and retrieval speed in large-scale data, a novel retrieval algorithm over encrypted speech is proposed. Before encrypted speech is uploaded, perceptual hashing sequences must be embedded as a digital watermark. In the process of generating perceptual hashing, multifractal characteristic of speech that has good distinctiveness...

chapter

Analysis of glottal source parameters in Parkinsonian speech

Jane Hanratty, Catherine Deegan, Mary Walsh, Barry Kirkpatrick

2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) > 3666 - 3669

2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)

Diagnosis and monitoring of Parkinson's disease present a number of challenges, as there is no definitive biomarker despite the broad range of symptoms. Research is ongoing to produce objective measures that can either diagnose Parkinson's or act as an objective decision support tool. Recent research on speech-based measures has demonstrated promising results. This study aims to investigate the characteristics...

chapter

Thin-film, high-density micro-electrocorticographic decoding of a human cortical gyrus

Leah Muller, Sarah Felix, Kedar G. Shah, Kye Lee, et al.

2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) > 1528 - 1531

2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)

High-density electrocorticography (ECoG) arrays are promising interfaces for high-resolution neural recording from the cortical surface. Commercial options for high-density arrays are limited, and historically tradeoffs must be made between spatial coverage and electrode density. However, thin-film technology is a promising alternative for generating electrode arrays capable of large area coverage...

chapter

Perception auditory factor for speaker recognition in noisy environment

Di Wu, Heming Zhao, Di Wu, Zhe Feng, et al.

2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD) > 1916 - 1920

2016 12th International Conference on Natural Computation and 13th Fuzzy Systems and Knowledge Discovery (ICNC-FSKD)

Speaker recognition plays an important role in speech processing and classification. In this paper, we propose a feature extraction method using the Perception Auditory Factor to improve the performance of speaker recognition in noisy environments. After speech enhancement based on auditory perception characteristics and 2-dimensional enhancement of the spectrogram, the speech distribution is obtained from...

chapter

Speech signal analysis and enhancement based on HNM synthesis

Indrayani P. Suryawanshi, Chandraprabha A. Manjare

2016 International Conference on Computing Communication Control and automation (ICCUBEA) > 1 - 5

2016 International Conference on Computing Communication Control and automation (ICCUBEA)

Speech enhancement is a growing part of communication systems; enhancement means improvement in the value or quality of something. This paper explains speech improvement based on the harmonic noise model (HNM) and the MMSE algorithm. The approach has several advantages, such as high-quality speech synthesis and speech coding through flexible and effective decomposition of speech. So we use this method for speech...

chapter

Noise robust dysarthric speech classification using domain adaptation

Alan Wisler, Visar Berisha, Andreas Spanias, Julie Liss

2016 Digital Media Industry & Academic Forum (DMIAF) > 135 - 138

2016 Digital Media Industry & Academic Forum (DMIAF)

This paper investigates the viability of a screening application that could be used to identify individuals with dysarthria from among a larger population using sentence-level speech data. This task presents a number of challenges, particularly if we aim to identify the disorder in the earlier stages, before the more significant symptoms have begun to manifest themselves. A principal challenge in this...

chapter

Multi-source localization based on approximated kernel density estimator and spatial likelihood function in near-field reverberant environment

Yu-Zhuo Fang, Xu Zhi-Yong, Zhao Zhao

2016 International Conference on Audio, Language and Image Processing (ICALIP) > 596 - 599

2016 International Conference on Audio, Language and Image Processing (ICALIP)

To cope with multi-source localization in a near-field reverberant environment, an approximated kernel density estimator (KDE) algorithm is introduced to provide robust anti-reverberation performance, and multi-stage (MS) is used to resolve the high-frequency spectral aliasing caused by the wide spacing of the microphone array. A spatial likelihood function (SLF) is then built to mix the pairwise...

chapter

Audio content analysis based on density of peaks in amplitude envelope

Tomasz Maka

2016 39th International Conference on Telecommunications and Signal Processing (TSP) > 331 - 334

2016 39th International Conference on Telecommunications and Signal Processing (TSP)

This paper presents an approach to audio parameterization using properties of the peaks detected in the amplitude envelope. The proposed solution is based on the observation that abrupt changes in the signal envelope are connected with the type of audio signal. For this purpose, we used the density properties of peaks to calculate the feature vectors. The extraction process exploits an amplitude envelope estimation...

INFONA - science communication portal
