Items from 1 to 20 out of 35 results

chapter

Throat microphone speech recognition using mfcc

Amritha Vijayan, Bipil Mary Mathai, Karthik Valsalan, Riyanka Raji Johnson, et al.

2017 International Conference on Networks & Advances in Computational Technologies (NetACT) > 392 - 395

2017 International Conference on Networks & Advances in Computational Technologies (NetACT)

The Throat Microphone (TM) is a non-acoustic device, relying on the vibrations of vocal folds rather than the audible sound produced. Correctly capturing vocal fold vibrations is difficult due to poor signal representation capabilities. The system recognizes the TM vibrations and produces the corresponding speech sound. This is done by extracting features from the spectrum of the TM vibrations and...

chapter

Software development for the speech signals analysis

Daria V. Borovikova, Vladimir K. Makukha

2017 18th International Conference of Young Specialists on Micro/Nanotechnologies and Electron Devices (EDM) > 622 - 625

2017 18th International Conference of Young Specialists on Micro/Nanotechnologies and Electron Devices (EDM)

The objective is to develop a speech analysis program designed to help speech therapists and phoniatricians in their work. The program provides recording and voice analysis functions, as well as the ability to add patient information to a database (DB).

chapter

A new speech corpus in Spanish for speaker verification

N. Garcia, T. Arias-Vergara, J. R. Orozco-Arroyave, J. F. Vargas-Bonilla

2016 XXI Symposium on Signal Processing, Images and Artificial Vision (STSIVA) > 1 - 7

2016 XXI Symposium on Signal Processing, Images and Artificial Vision (STSIVA)

In this paper we present a new database with speech recordings in Spanish. The database contains recordings of 54 native Spanish speakers. It is suitable for the development and testing of better speaker verification systems. The recording procedure, equipment, and speech tasks are detailed. Experiments using the GMM-UBM speaker verification methodology were performed. The methodology...

chapter

Who spoke what? A latent variable framework for the joint decoding of multiple speakers and their keywords

Harshavardhan Sundar, Thippur V. Sreenivas

2016 International Conference on Signal Processing and Communications (SPCOM) > 1 - 5

2016 International Conference on Signal Processing and Communications (SPCOM)

In this paper, we present a latent variable (LV) framework to identify all the speakers and their keywords given a single channel microphone recording containing a multi-speaker mixture signal. We introduce two separate LVs to denote active speakers and the keywords uttered. The dependency of a spoken keyword on the speaker is modeled through a conditional probability mass function. The distribution...

chapter

Binaural wind noise detection, cancellation and its evaluation for hearing aids based on HRTF cues

Hidetoshi Nakashima, Ryousuke Kouyama, Nobuhiko Hiruma, Yoh-ichi Fujisaka

IECON 2015 - 41st Annual Conference of the IEEE Industrial Electronics Society > 4896 - 4899

IECON 2015 - 41st Annual Conference of the IEEE Industrial Electronics Society

Wind noise is one of the most significant issues for hearing aid users. This paper addresses the issue using binaural phase and level differences. Most sounds, including speech signals, carry directional information; that is, the interaural phase difference (IPD) and interaural level difference (ILD) do not vary if the sound direction is fixed. However, wind noise has no directional information,...

chapter

A real-world recording database for ad hoc microphone arrays

William S. Woods, Elior Hadad, Ivo Merks, Buye Xu, et al.

2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) > 1 - 5

2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)

We report on a recently-recorded database for use in processing of ad hoc microphone constellations. Twenty-four microphones were positioned in various locations at a central table in a large room, and their outputs were recorded while 4 target talkers at the table both read from a list of sentences in a constrained way and also maintained a natural conversation for several minutes. This was done...

chapter

Speech database acquisition for assisted living environment applications

Mihai Dogariu, Horia Cucu, Andi Buzo, Dragos Burileanu, et al.

2015 International Conference on Speech Technology and Human-Computer Dialogue (SpeD) > 1 - 6

2015 International Conference on Speech Technology and Human-Computer Dialogue (SpeD)

Home automation has become a subject of increasing interest for both industry and research, as awareness of such systems is growing and their benefits are easy to see. The new trend is to develop smart homes where commands can be given by speech. This way of communicating, besides being the most natural, has the advantage of offering flexibility to users, especially when they...

chapter

Significance of implementing polarity detection circuits in audio preamplifiers

B. Deepak, D. Govind

2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 2197 - 2200

2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI)

The reversal of the current directions in audio circuit elements causes polarity inversion of the acquired audio signal with respect to the reference input signal. The objective of the work presented in this paper is to implement a simple polarity detection circuit in audio preamplifiers which provides an indication of the signal polarity inversion. The present work also demonstrates the possibilities...

chapter

A new approach to dereverberation and noise reduction with microphone arrays

J. L. Sanchez-Bote, J. Gonzalez-Rodriguez, J. Ortega-Garcia

2000 10th European Signal Processing Conference > 1 - 4

2000 10th European Signal Processing Conference

In this paper the speech enhancement abilities of a new array-based processor have been tested. The proposed system works in three cascaded stages. First, the signals are time-aligned with the estimated direction of the desired sound source. Second, the signal is decomposed into its all-pass and minimum-phase components using cepstral processing. At this point, beamforming and liftering in cepstral domain...

chapter

/r/-Letter disorder diagnosis (/r/-LDD): Arabic speech database development for automatic diagnosis of childhood speech disorders (Case study)

Nacereddine Hammami, Mouldi Bedda, Nadir Farah, Sihem Mansouri

2015 Intelligent Systems and Computer Vision (ISCV) > 1 - 7

2015 Intelligent Systems and Computer Vision (ISCV)

In light of the scarcity of both published and free acoustic Arabic databases, we propose in this paper an acoustic Arabic database to serve as a reference in the field of automatic Arabic speech recognition. This database is the result of a case study developed to contribute to the automatic diagnosis of speech disorders in Arabic-speaking children. The field work was conducted in collaboration with experts...

chapter

Measurement, analysis and simulation of wind noise signals for mobile communication devices

Christoph Matthias Nelke, Peter Vary

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC) > 327 - 331

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC)

In this contribution, we study the characteristics of sound generated by wind and derive a signal model for the synthesis of wind noise signals. An analysis of the statistics of wind noise recorded in a laboratory setup is carried out with respect to the spectral and temporal properties of the signals. In particular, an autoregressive model is developed for the spectral shape description and the...

chapter

Multichannel audio database in various acoustic environments

Elior Hadad, Florian Heese, Peter Vary, Sharon Gannot

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC) > 313 - 317

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC)

In this paper we describe a new multichannel room impulse response database. The impulse responses are measured in a room with a configurable reverberation level, resulting in three different acoustic scenarios with reverberation times RT₆₀ equal to 160 ms, 360 ms and 610 ms. The measurements were carried out in recording sessions of several source positions on a spatial grid (angle range of −90° to...

chapter

The single- and multichannel audio recordings database (SMARD)

Jesper Kjaer Nielsen, Jesper Rindom Jensen, Soren Holdt Jensen, Mads Graesboll Christensen

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC) > 40 - 44

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC)

A new single- and multichannel audio recordings database (SMARD) is presented in this paper. The database contains recordings from a box-shaped listening room for various loudspeaker and array types. The recordings were made for 48 different configurations of three different loudspeakers and four different microphone arrays. In each configuration, 20 different audio segments were played and recorded...

chapter

Novel windowing technique of MFCC for speaker identification with Modified Polynomial Classifiers

Aarti Bakshi, Sunil Kumar Kopparapu, Sanjay Pawar, Shikha Nema

2014 5th International Conference - Confluence The Next Generation Information Technology Summit (Confluence) > 292 - 297

2014 5th International Conference - Confluence The Next Generation Information Technology Summit

Speech is one of the most popular parameters used to identify a speaker from her spoken phrase. Feature extraction from speech is a necessary first step in a speaker identification process. Traditionally, computation of Mel Frequency Cepstral Coefficient (MFCC) features uses a Hamming window as a preprocessing step to reduce spectral leakage. However, the Hamming window results in reasonable side lobes...

chapter

Multi channel reverberant speech enhancement using LP residual cepstrum

Karan Nathwani, Harish Padaki, Rajesh M. Hegde

2013 Asilomar Conference on Signals, Systems and Computers > 555 - 559

2013 Asilomar Conference on Signals, Systems and Computers

In this work, a method for multichannel speech enhancement using the linear prediction (LP) residual cepstrum is proposed. The method performs deconvolution at each microphone output in the cepstral domain. The deconvolution of the acoustic impulse response from the reverberated signal in each individual channel removes early reverberation. This dereverberated output from each channel is then spatially filtered...

chapter

KSU Speech Database: Text Selection, Recording and Verification

Mansour Alsulaiman, Zulfiqar Ali, Ghulam Muhammed, Mohamed Bencherif, et al.

2013 European Modelling Symposium > 237 - 242

2013 European Modelling Symposium (EMS)

The King Saud University speech database (KSU-DB) is a very rich speech database of the Arabic language. Its richness spans many dimensions. It has more than three hundred speakers of both genders. The speakers are Arabs and non-Arabs belonging to twenty-nine different nationalities. The database has different types of text such as isolated words, digits, phonetically rich words and sentences, phonetically...

chapter

Video-informed approach for enhancing audio source separation through noise source suppression

Jack Harris, Bertrand Rivet, Syed Mohsen Naqvi, Jonathon A. Chambers, et al.

2013 IEEE International Workshop on Machine Learning for Signal Processing (MLSP) > 1 - 6

2013 IEEE International Workshop on Machine Learning for Signal Processing (MLSP)

This paper describes a method in which an interference noise source within an audio source separation scenario is suppressed from a mixture. The principal idea of the proposed method is to use a video camera array to locate an interference noise source whose 3D position is then used to estimate a matrix of frequency responses (FRs) by linearly combining a series of previously known FRs. A filter is...

chapter

Fundamental research on a singing training support system for Shigin: Japanese traditional singing

Masashi Nakayama

2013 Proceedings of IEEE Southeastcon > 1 - 6

IEEE SOUTHEASTCON 2013

Shigin is the singing of Japanese or Chinese poetry, following a melody called “seicho” in Japanese. However, it is difficult to master Shigin because a trainer teaches according to his/her own impressions, and its melody employs a relative music scale. Therefore, this paper proposes a singing training support system for Shigin that clarifies differences in signal characteristics between a trainee...

article

A Multimodal Approach to Speaker Diarization on TV Talk-Shows

Félicien Vallet, Slim Essid, Jean Carrive

IEEE Transactions on Multimedia > 2013 > 15 > 3 > 509 - 520

In this article, we propose solutions to the problem of speaker diarization of TV talk-shows, a problem for which adapted multimodal approaches, relying on streams of data other than audio alone, remain largely underexploited. Hence we propose an original system that leverages prior knowledge on the structure of this type of content, especially the visual information relating to the active speakers,...

chapter

Online learning for template-based multi-channel ego noise estimation

Gökhan Ince, Kazuhiro Nakadai, Keisuke Nakamura

2012 IEEE/RSJ International Conference on Intelligent Robots and Systems > 3282 - 3287

2012 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2012)

This paper presents a system that gives a robot the ability to diminish its own disturbing noise (i.e., ego noise) by utilizing template-based ego noise estimation, an algorithm previously developed by the authors. In pursuit of an autonomous, online and adaptive template learning system in this work, we specifically focus on eliminating the requirement of an offline training session performed in...

Data set:
ieee
Keywords:
DATABASES
SPEECH
MICROPHONES

Publication type

book (33)
article (2)

Keywords

SPEAKER RECOGNITION (11)
SPEECH RECOGNITION (9)
FEATURE EXTRACTION (7)
SPEECH PROCESSING (7)
TRAINING (7)
ACOUSTICS (6)
NOISE (6)
REVERBERATION (6)
MEL FREQUENCY CEPSTRAL COEFFICIENT (5)
SIGNAL TO NOISE RATIO (4)
ACCURACY (3)
ARRAYS (3)
COMPUTERS (3)
CONFERENCES (3)
DATABASE (3)
EDUCATIONAL INSTITUTIONS (3)
MATHEMATICAL MODEL (3)
PERFORMANCE EVALUATION (3)
ROBOTS (3)
SPEAKER IDENTIFICATION (3)
SPEECH ENHANCEMENT (3)
VOCABULARY (3)
ACOUSTIC MEASUREMENTS (2)
ARABIC SPEECH DATABASE (2)
ARRAY SIGNAL PROCESSING (2)
AUDITORY SYSTEM (2)
CEPSTRAL ANALYSIS (2)
CEPSTRUM (2)
CONTROL SYSTEMS (2)
DATA ACQUISITION (2)
DATA MODELS (2)
ELECTRONIC MAIL (2)
GAUSSIAN PROCESSES (2)
JAPANESE TRADITIONAL SONG (2)
MFCC (2)
NOISE MEASUREMENT (2)
ROBUSTNESS (2)
ROOM IMPULSE RESPONSE (2)
SET THEORY (2)
SHIGIN (2)
SHIGIN MELODY (2)
SIGNAL PROCESSING (2)
SINGING TRAINING (2)
SPEAKER VERIFICATION (2)
SPEECH RECORDING (2)
TELEPHONE SETS (2)
TESTING (2)
TIME FREQUENCY ANALYSIS (2)
TIME-FREQUENCY ANALYSIS (2)
TRAINING DATA (2)
VECTORS (2)
AACHEN IMPULSE RESPONSE DATABASE (1)
ACOUSTIC PROPERTIES (1)
ADAPTATION MODEL (1)
ADDITIVE NOISE (1)
AIR DATABASE (1)
ALMOST-ANECHOIC ROOM (1)
AMERICAN ENGLISH ACOUSTIC MODELS (1)
APPROXIMATION METHODS (1)
ARABIC SPEECH (1)
ARABIC SPEECH RECOGNITION (1)
ARCHITECTURAL ACOUSTICS (1)
ARTIFICIAL EARS (1)
ARTIFICIAL INTELLIGENCE (1)
ARTIFICIAL NEURAL NETWORKS (1)
ASSOCIATIVE MEMORY (1)
AUDIO DATABASE (1)
AUDIO DATABASES (1)
AUDIO RECORDING (1)
AUDIO-VISUAL SOURCE SEPARATION (1)
AUSTRALIAN IN-CAR ENGLISH SPEECH CORPUS (1)
AUTOMATIC ACQUISITION DEVICE IDENTIFICATION (1)
AUTOMATIC DIAGNOSIS OF SPEECH DISORDERS (1)
AUTOMATIC ESTIMATION (1)
AUTOMATIC SPEAKER RECOGNITION (1)
AUTOMATIC SPEECH RECOGNITION (1)
AUTOMATIC TREATMENT OF SPEECH DISORDERS (1)
AUTOMATION (1)
BASIS FILTERS (1)
BEAMFORMING (1)
BILINGUAL SPEECH (1)
BINARY TREES (1)
BINAURAL (1)
BINAURAL CONTEXT (1)
BINAURAL HEARING (1)
BINAURAL MEASUREMENTS (1)
BINAURAL ROOM IMPULSE RESPONSE DATABASE (1)
BIOMETRICS (ACCESS CONTROL) (1)
CAMERAS (1)
CHANNEL EQUALIZATION (1)
CHEMICAL ENGINEERING (1)
CHILDHOOD SPEECH DISORDERS (1)
COHERENCE (1)
COHERENCE-BASED DEREVERBERATION ALGORITHM (1)
COMMUNICATION CHANNELS (1)
COMMUNICATION QUALITY ASSESSMENT (1)
COMPENSATION (1)

INFONA - science communication portal

