Research on speech/music classification of digital audio has been popular in academia and is increasingly applied in industry. Most conventional methods pair carefully hand-crafted features with Gaussian Mixture Models. To achieve the best performance, some of these features require a long latency due to look-ahead, and/or incur a large onset error. This paper takes a different approach to the problem...
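The GMM-based baseline that this abstract contrasts itself with can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's method: it uses a single diagonal-covariance Gaussian per class (a 1-component GMM) over synthetic stand-in feature vectors, and classifies a clip by summed frame log-likelihood. All names and parameters are illustrative assumptions.

```python
import numpy as np

def fit_gaussian(X):
    """Fit a single diagonal-covariance Gaussian (a 1-component GMM)."""
    mu = X.mean(axis=0)
    var = X.var(axis=0) + 1e-6  # variance floor to avoid division by zero
    return mu, var

def log_likelihood(X, mu, var):
    """Per-frame log-likelihood under a diagonal Gaussian."""
    return -0.5 * (np.log(2 * np.pi * var) + (X - mu) ** 2 / var).sum(axis=1)

def classify(frames, speech_model, music_model):
    """Label a clip 'speech' or 'music' by summed frame log-likelihood."""
    ll_speech = log_likelihood(frames, *speech_model).sum()
    ll_music = log_likelihood(frames, *music_model).sum()
    return "speech" if ll_speech > ll_music else "music"

rng = np.random.default_rng(0)
speech_train = rng.normal(0.0, 1.0, (200, 4))  # stand-in feature vectors
music_train = rng.normal(3.0, 1.0, (200, 4))
speech_model = fit_gaussian(speech_train)
music_model = fit_gaussian(music_train)
print(classify(rng.normal(0.0, 1.0, (50, 4)), speech_model, music_model))
```

A real system would use many mixture components per class and acoustic features such as MFCCs rather than synthetic vectors; the decision rule, however, is the same likelihood comparison.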
Part-of-Speech (POS) tagging is the process of marking each word in a text as corresponding to a particular part of speech, based on its definition and context. POS taggers play an important role in natural language applications such as speech recognition, natural language parsing, and information retrieval and extraction. This paper discusses an architecture for designing a POS tagger for Malayalam...
Anaphora resolution (AR) is the process of resolving references to an entity in the discourse. This paper presents an algorithm to identify pronominals and their antecedents in Malayalam text input. Anaphora resolution is achieved by employing a hybrid of statistical machine learning and rule-based approaches. The system is implemented by exploiting the morphological richness of the language...
The analysis of various components of the Electroglottograph (EGG) signal, obtained after Ensemble Empirical Mode Decomposition (EEMD), is the primary objective of this paper. The ability of EEMD to detect intermittent high-frequency data embedded in data of lower frequency is exploited to segregate the epoch locations and the periodic nature of the EGG signal. The dyadic filterbank property of EEMD...
This paper introduces a novel two-dimensional feature extraction method for environmental sound classification, based on two-dimensional semi-nonnegative matrix factorization (2D Semi-NMF) of scale-frequency maps. We first extract scale-frequency maps (SFMs) from the input signals; this feature is considered to preserve the scale and frequency characteristics of the signals. Second, a 2D Semi-NMF method...
In this paper, we address the problem of automatic speech summarization on open-domain TED talks. The large vocabulary and the speaker-to-speaker diversity of topics present significant difficulties. The challenges include not only how to handle disfluencies and fillers, but also how to extract topic-related, meaningful messages from the free-form talks. Here, we propose to incorporate semantic and...
In this paper, we propose a method for avoiding digressions in discussion by detecting unnecessary utterances and having a dialogue system intervene. The detector is based on features derived from word frequency and topic shifts. The performance (i.e. accuracy, recall, precision, and F-measure) of the unnecessary utterance detector is evaluated through leave-one-dialogue-out cross-validation. In the...
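The leave-one-dialogue-out evaluation protocol mentioned above can be sketched in plain Python. The snippet below is a minimal illustration, not the authors' code: each dialogue is held out in turn, a detector is trained on the rest, and the pooled predictions yield accuracy, precision, recall, and F-measure. The toy length-based detector and all field names are hypothetical.

```python
from collections import defaultdict

def loo_dialogue_cv(samples, train_fn, predict_fn):
    """Leave-one-dialogue-out CV: hold out each dialogue in turn,
    train on the rest, and pool the held-out predictions."""
    dialogues = defaultdict(list)
    for s in samples:
        dialogues[s["dialogue"]].append(s)
    tp = fp = fn = tn = 0
    for held_out in dialogues:
        train = [s for d, ss in dialogues.items() if d != held_out for s in ss]
        model = train_fn(train)
        for s in dialogues[held_out]:
            pred = predict_fn(model, s)
            if pred and s["unnecessary"]: tp += 1
            elif pred and not s["unnecessary"]: fp += 1
            elif not pred and s["unnecessary"]: fn += 1
            else: tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / (tp + fp + fn + tn),
            "precision": precision, "recall": recall, "f_measure": f_measure}

# Toy detector (hypothetical rule): flag an utterance as unnecessary
# when it is shorter than the mean training utterance length.
def train_fn(train):
    return sum(len(s["words"]) for s in train) / len(train)

def predict_fn(mean_len, s):
    return len(s["words"]) < mean_len

samples = [
    {"dialogue": "d1", "words": ["ok"], "unnecessary": True},
    {"dialogue": "d1", "words": list("abcdef"), "unnecessary": False},
    {"dialogue": "d2", "words": ["uh"], "unnecessary": True},
    {"dialogue": "d2", "words": list("abcdefg"), "unnecessary": False},
    {"dialogue": "d3", "words": ["hmm"], "unnecessary": True},
    {"dialogue": "d3", "words": list("abcde"), "unnecessary": False},
]
print(loo_dialogue_cv(samples, train_fn, predict_fn))
```

Splitting by dialogue rather than by utterance prevents utterances from the same conversation leaking between train and test folds.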
Sensing and analyzing human behavior play a great role in improving service quality and employee training. This paper presents novel frameworks for detecting customer communication and for lead time estimation (LTE) using multi-sensor data, sound data, and accounting data collected in a restaurant. These are useful for managing work environments and identifying problems faced by employees. Lead time from order to...
In this paper, a novel methodology is discussed for indexing domain-specific audio archives using linguistic information present in the speech signal. The audio indexing system is phone-based and can work under limited training data conditions. A training data set that captures the linguistic information within the Hindi language at the syllable level is first developed. A reduced phone set is then derived...
The human voice can serve as a password/key for access to various services. In a speaker verification system, the speaker is verified using features extracted from the voice signal. In automated speaker verification, the speaker's voice signal is processed to extract speaker-specific information, which is used to generate a voiceprint, also known as a template, that cannot be replicated...
This paper presents the process of automatic Quranic accent identification. Recent feature extraction techniques used for Quranic verse rule (Tajweed) identification include Mel Frequency Cepstral Coefficients (MFCC), which are prone to additive noise and may degrade classification results. Therefore, augmenting MFCC with Spectral Centroid features is proposed to improve performance...
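The spectral centroid feature that the abstract proposes appending to MFCC can be computed in a few lines. The snippet below is a minimal NumPy sketch, not the authors' code: it windows the signal, takes the magnitude spectrum of each frame, and returns the magnitude-weighted mean frequency. Frame length, hop size, and the test tone are illustrative assumptions.

```python
import numpy as np

def spectral_centroid(signal, sr, frame_len=512, hop=256):
    """Per-frame spectral centroid: magnitude-weighted mean frequency (Hz)."""
    window = np.hanning(frame_len)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    centroids = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window
        mag = np.abs(np.fft.rfft(frame))
        centroids.append((freqs * mag).sum() / (mag.sum() + 1e-10))
    return np.array(centroids)

sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000 * t)  # pure 1 kHz tone
print(spectral_centroid(tone, sr).mean())  # close to 1000 Hz
```

In a combined feature set, the per-frame centroid would simply be concatenated with the per-frame MFCC vector before classification.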
In this paper, a query-by-sound-example video retrieval framework based on audio concepts is presented. First, the audio stream extracted from the movies in the database is grouped into orientation clusters using an unsupervised segmentation technique. The audio signals undergo a newly proposed pretreatment process to distinguish audio concepts, which is used for indexing the video data. Second, the query asked...
An effective speech brain-machine interface requires not only selecting the best cortical recording sites and signal features for decoding speech production, but also minimizing clinical risk for the patient. Motivated by this need to reduce patient risk, the purpose of this study is to detect voice activity (speech onset and offset) automatically from spatial-spectral features of electrocorticographic signals...
This paper presents acoustic analysis work related to Modern Standard Arabic (MSA). The problem of classifying consonant counterparts in MSA is tackled here. The study considers four phonemes: /dˤ, ðˤ/ and their non-emphatic counterparts /d, ð/, respectively. The goal is accurate automatic classification of these phonemes. Artificial neural networks (ANNs) are used for this purpose...
In this paper, the acoustic features of pitch, intensity, formants, and speech rate are extracted and used to classify the following Arabic speech emotions: neutral, sad, happy, surprised, and angry. Three sentences spoken by four male and four female native Arabic speakers were selected from a newly developed Arabic speech corpus (KSUEmotions). Perception tests using human listeners yielded scores...
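Of the acoustic features listed above, pitch is the one most often estimated directly from the waveform. The snippet below is a minimal sketch of autocorrelation-based pitch estimation, not the authors' method: it finds the autocorrelation peak within a plausible voice-pitch lag range on a synthetic 120 Hz "voiced" frame. The search bounds and frame size are illustrative assumptions.

```python
import numpy as np

def estimate_pitch(frame, sr, fmin=75.0, fmax=400.0):
    """Estimate F0 of a voiced frame from its autocorrelation peak."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)  # lag range for fmin..fmax
    lag = lo + np.argmax(ac[lo:hi])
    return sr / lag

sr = 16000
t = np.arange(int(0.04 * sr)) / sr        # one 40 ms frame
frame = np.sin(2 * np.pi * 120 * t)       # synthetic 120 Hz "voice"
print(round(estimate_pitch(frame, sr), 1))  # close to 120 Hz
```

Intensity (frame energy) and speech rate are simpler statistics over the same framing, while formants typically require LPC analysis rather than autocorrelation alone.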
Human-machine interaction is one of the most burgeoning areas of research in the field of information technology. To date, a majority of the research in this field has been conducted using unimodal and multimodal systems with asynchronous data. Because of this, improper synchronization has become a common problem, increasing system complexity and system response time...
Human listeners are capable of recognizing speech in noisy environments, while most traditional speech recognition methods do not perform well in the presence of noise. Unlike traditional Mel-frequency cepstral coefficient (MFCC)-based methods, this study proposes a phoneme classification technique using the neural responses of a physiologically-based computational model of the auditory periphery...
This work explores Deep Belief Networks (DBNs) for the task of detecting vowel-like regions (VLRs). Vowels and semivowels are considered VLRs. Using vocal tract features at the input layer of the DBN, we extract evidence for VLRs by transforming the vocal tract features through multiple non-linear hidden layers. A linear classifier is used to predict the class of the evidence, i.e., whether it is...
Speech recognition systems are based on either a parametric or a non-parametric approach. Parametric systems such as HMMs have been the dominant technology for speech recognition in the past decade. Despite many advancements and enhancements in the design of these systems, key problems, such as long-term temporal dependence, have not yet been solved. Recently, due to the availability...
Speech/non-speech detection (SND) distinguishes between speech and non-speech segments in recorded audio and video documents. SND systems can help reduce the required storage space when only the speech segments of audio documents are needed, for example for content analysis, spoken language identification, etc. In this work, we experimented with the use of time-domain, frequency-domain, and cepstral...
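The simplest time-domain features mentioned above, short-time energy and zero-crossing rate, already give a workable SND baseline. The snippet below is a minimal sketch under assumed parameters (frame size, hop, threshold), not the paper's system: it labels each frame as speech when its energy exceeds a fixed threshold, tested on a silence-then-noise signal.

```python
import numpy as np

def frame_features(signal, frame_len=400, hop=200):
    """Two time-domain features per frame: short-time energy and
    zero-crossing rate (ZCR)."""
    energies, zcrs = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(np.mean(frame ** 2))
        zcrs.append(np.mean(np.abs(np.diff(np.sign(frame))) > 0))
    return np.array(energies), np.array(zcrs)

def detect_speech(signal, energy_thresh=0.01):
    """Label each frame: True = speech, False = non-speech."""
    energy, _ = frame_features(signal)
    return energy > energy_thresh

sr = 16000
silence = np.zeros(sr)                        # 1 s of silence
rng = np.random.default_rng(0)
speech_like = 0.5 * rng.standard_normal(sr)   # 1 s of speech-like noise
labels = detect_speech(np.concatenate([silence, speech_like]))
print(labels[:3], labels[-3:])
```

A fixed threshold fails under varying noise floors, which is why systems like the one abstracted above combine time-domain features with frequency-domain and cepstral ones and learn the decision boundary instead.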