Search results for: . .

Items from 1 to 13 out of 13 results

chapter

Use of affect based interaction classification for continuous emotion tracking

Hossein Khaki, Engin Erzin

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 2881 - 2885

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Natural and affective handshakes of two participants define the course of dyadic interaction. Affective states of the participants are expected to be correlated with the nature of the dyadic interaction. In this paper, we extract two classes of the dyadic interaction based on temporal clustering of affective states. We use the k-means temporal clustering to define the interaction classes, and utilize...

chapter

Line-wise text identification in comic books: A support vector machine-based approach

Srikanta Pal, J. Christophe Burie, Umapada Pal, Jean Marc Ogier

2016 International Joint Conference on Neural Networks (IJCNN) > 3995 - 4000

2016 International Joint Conference on Neural Networks (IJCNN)

This paper presents a study of line-wise text identification in comic books. Due to the unavailability of a single OCR system which can handle comic text of multiple scripts, the comic text identification based on script becomes an essential step for choosing the appropriate OCR. In this investigation, a new attempt has been made to explore a comic text identification technique of speech balloon to...

chapter

The influence of the speaker on indices of phonation

Imen Daly, Zied Hajaiej, Ali Garsallah

2016 2nd International Conference on Advanced Technologies for Signal and Image Processing (ATSIP) > 771 - 775

2016 2nd International Conference on Advanced Technologies for Signal and Image Processing (ATSIP)

In this paper we present the method we have adopted in order to leave in search of the hearer, through different acoustic measurements which have been described as being influenced by the speaker included in the database TIMIT and ked TIMIT. This study allows us to better understand where there are relevant indices to discriminate the speakers and presentation criteria to distinguish a file giving...

chapter

Speech / music classification using Vocal Tract Constriction aspect of speech

Banriskhem K. Khonglah, S. R. Mahadeva Prasanna

2015 Annual IEEE India Conference (INDICON) > 1 - 6

2015 Annual IEEE India Conference (INDICON)

This work explores the vocal tract constriction aspect of speech for speech / music classification. During speech production, the vocal tract is closed for voiced bars and open for low vowels. For high vowels, semivowels, laterals, voiced fricatives and other sounds the vocal tract is in the intermediate position of the closed and open cases. Music signal, in particular the instrumental and non-vocal...

chapter

Bag-of-words representation for non-intrusive speech quality assessment

Qiaohong Li, Weisi Lin, Yuming Fang, Daniel Thalmann

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) > 616 - 619

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)

Research on non-intrusive speech quality assessment (SQA) aims to develop a computational model simulating the human perception of speech signals accurately and automatically without any prior information about the reference clean speech signals. In this paper, we propose to learn a non-intrusive SQA metric based on bag-of-words (BoW) representation of speech signals. In particular, the proposed method...

chapter

Cross-language text-independent speaker identification

Geoffrey Durov, Frederic Jauquet

9th European Signal Processing Conference (EUSIPCO 1998) > 1 - 4

9th European Signal Processing Conference (EUSIPCO 1998)

In this paper, we investigate the influence of the language on the text-independent speaker recognition. For this purpose, we have used several automatic text-independent speaker recognition methods (Multivariable Auto-Regression, Vector Quantization and Histogram Classifiers). To measure the effect of the language, we have applied these methods on the POLY-COST 250 multi-language database. Among...

chapter

Nonlinear scale decomposition based features for visual speech recognition

Iain Matthews, J. Andrew Bangham, Richard Harvey, Stephen Cox

9th European Signal Processing Conference (EUSIPCO 1998) > 1 - 4

9th European Signal Processing Conference (EUSIPCO 1998)

A mathematical morphology based filter structure called a sieve is used to process mouth image sequences of a talker's mouth and form visual speech features. The effects of varying the type of filter, the post-processing and hidden Markov model (HMM) parameters on recognition accuracy are investigated using two audio-visual speech databases.

chapter

Recursive text segmentation for Indonesian Automated Document Reader for people with visual impairment

Teresa Vania Tjahja, Anto Satriyo Nugroho, James Purnama, Nur Aziza Azis, more

Proceedings of the 2011 International Conference on Electrical Engineering and Informatics > 1 - 6

2011 International Conference on Electrical Engineering and Informatics (ICEEI)

This research is conducted to accommodate the needs of visually impaired people through an intelligent system, which reads textual information on papers and produces corresponding voice. Indonesian Automated Document Reader (I-ADR) is operated via a voice-based user interface to scan a document page. Textual information from the scanned page is then extracted using Optical Character Recognition (OCR)...

chapter

Speaker authentication using video-based lip information

B Goswami, C Chan, J Kittler, W Christmas

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 1908 - 1911

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The lip-region can be interpreted as either a genetic or behavioural biometric trait depending on whether static or dynamic information is used. In this paper, we use a texture descriptor called Local Ordinal Contrast Pattern (LOCP) in conjunction with a novel spatiotemporal sampling method called Windowed Three Orthogonal Planes (WTOP) to represent both appearance and dynamics features observed in...

chapter

Local Ordinal Contrast Pattern histograms for spatiotemporal, lip-based speaker authentication

B Goswami, Chi Ho Chan, J Kittler, B Christmas

2010 Fourth IEEE International Conference on Biometrics: Theory, Applications and Systems (BTAS) > 1 - 6

2010 IEEE Fourth International Conference on Biometrics: Theory, Applications and Systems (BTAS 2010)

The lip-region can be interpreted as either a genetic or behavioral biometric trait depending on whether static or dynamic information is used. Despite this breadth of possible application as a biometric, lip-based biometric systems are scarcely developed in scientific literature compared to other more popular traits such as face or voice. This is because of the generalized view of the research community...

chapter

Noise robust voice activity detection using normal probability testing and time-domain histogram analysis

H Ghaemmaghami, D Dean, S Sridharan, I McCowan

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4470 - 4473

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

This paper presents a method of voice activity detection (VAD) suitable for high noise scenarios, based on the fusion of two complementary systems. The first system uses a proposed non-Gaussianity score (NGS) feature based on normal probability testing. The second system employs a histogram distance score (HDS) feature that detects changes in the signal through conducting a template-based similarity...

chapter

The Vera am Mittag German audio-visual emotional speech database

M. Grimm, K. Kroschel, S. Narayanan

2008 IEEE International Conference on Multimedia and Expo > 865 - 868

2008 IEEE International Conference on Multimedia and Expo (ICME)

The lack of publicly available annotated databases is one of the major barriers to research advances on emotional information processing. In this contribution we present a recently collected database of spontaneous emotional speech in German which is being made available to the research community. The database consists of 12 hours of audio-visual recordings of the German TV talk show “Vera am...

chapter

Speech enhancement for non-stationary noise environment by adaptive wavelet packet

Sungwook Chang, Y. Kwon, Sung-il Yang, I-jae Kim

2002 IEEE International Conference on Acoustics, Speech, and Signal Processing > 1 > I-561 - I-564

Proceedings of ICASSP '02

We consider the non-stationary or colored noise estimation by wavelet thresholding method. First, we propose node dependent thresholding for adaptation in colored or non-stationary noise. Next, we suggest a noise estimation method based on spectral entropy using histogram of intensity instead of estimation method based on median absolute deviation (MAD). And we use a modified hard thresholding to...

Filter options

Keywords:
DATABASES
HISTOGRAMS
SPEECH
Publication type:
book
