Search results

chapter

Understanding basis functions for vowels based on non-negative matrix factorization

Nandini C Nag, Milind Shah

2017 International Conference on Nascent Technologies in Engineering (ICNTE) > 1 - 6

With the advent of hands-free devices, speech recognition is of utmost importance, but it performs poorly in a cocktail-party environment without speech separation or speech denoising. Various techniques are available for speech separation, but one in common use today is non-negative matrix factorization (NMF). Non-negative matrix factorization decomposes the mixed signal into...

chapter

Kalman filter based stereo system identification with auto- and cross-decorrelation

Stefan Kuhl, Christiane Antweiler, Tobias Hubschen, Peter Jax

2017 Hands-free Speech Communications and Microphone Arrays (HSCMA) > 181 - 185

In stereo or multi-channel system identification, the most critical problems regarding online identification, e.g., for acoustic echo control, are the correlation properties of the excitation signals of the different audio channels. In this paper the impact of both the auto- and cross-correlation properties is considered and investigated. A new system combining appropriate decorrelation techniques...

chapter

Singing characterization using temporal and spectral features in Indian musical notes

Shivam Sharma, V. K. Mittal

2016 International Conference on Signal Processing and Communication (ICSC) > 346 - 351

Pitch extraction from a multi-pitched music signal relies significantly on the training data for tasks like enhanced music-voice separation. This paper aims at identifying characteristic temporal and spectral features, using speech processing techniques, that can help obtain crucial information, leading to a better understanding of the music structure. Towards this goal, the F0 contour has been studied...

chapter

Lexicon-based sentiment analysis of Indian Union Budget 2016–17

Moonis Shakeel, Vikram Karwal

2016 International Conference on Signal Processing and Communication (ICSC) > 299 - 302

In this work, sentiment analysis of the annual budget for financial year 2016–17 is performed. Text mining is used to extract text data from the budget document and to compute the word associations of significant words; their correlation with the associated words is also computed. Word frequencies and the corresponding word cloud are plotted. The analysis is done in R. The corresponding sentiment score is...

chapter

Gender identification from speech signal by examining the speech production characteristics

Esther Ramdinmawii, V. K. Mittal

2016 International Conference on Signal Processing and Communication (ICSC) > 244 - 249

Gender identification is the task of determining a person's gender from his or her voice. It has been implemented in several Automatic Speaker Recognition (ASR) systems and has proved to be of great significance. The use of gender identification in today's technology makes user authentication and identification easier in high-security systems. In this paper, we...

chapter

Securing digital information using the Rakaposhi system

Belmeguenai Aissa, Mansouri Khaled, Derouiche Nadir

2016 17th International Conference on Sciences and Techniques of Automatic Control and Computer Engineering (STA) > 298 - 307

In this work, we present an approach based on the Rakaposhi system for securing digital information. The approach is developed to encrypt and decrypt digital information. A gray-level image, speech data saved as .wav, and a text file saved as .txt are used to validate the proposed approach. The approach is easy and highly efficient. Some tests are performed to validate the performance of the proposed...

chapter

A statistical analysis on the impact of noise on MFCC features for speech recognition

Utpal Bhattacharjee, Swapnanil Gogoi, Rubi Sharma

2016 International Conference on Recent Advances and Innovations in Engineering (ICRAIE) > 1 - 5

Noise is omnipresent in almost all acoustical environments. The investigation presented here seeks to quantify the impact of noise on the mel-frequency cepstral coefficients (MFCC) of a speech signal. MFCC is one of the most commonly used features for speech recognition systems. However, it has been observed that the performance of MFCC-based systems degrades drastically with changing noise levels and noise types...

chapter

Pitch and formant estimation of bangla speech signal using autocorrelation, cepstrum and LPC algorithm

Muhammad Navid Anjum Aadit, Sharadindu Gopal Kirtania, Mehnaz Tabassum Mahin

2016 19th International Conference on Computer and Information Technology (ICCIT) > 371 - 376

In this paper, we present a comparative study of digital speech processing on Bangla speech signals. We represent the oral characteristics of the Bangla alphabet in terms of pitch and formant. We worked with both vowels and consonants to show their difference in practical use. We take oral speech signals as voice recordings and extract phonemes to analyze in both the time and frequency domains. Both male and female...

chapter

Automatic personality prediction from audiovisual data using random forest regression

Berkay Aydin, Ahmet Alp Kindiroglu, Oya Aran, Lale Akarun

2016 23rd International Conference on Pattern Recognition (ICPR) > 37 - 42

In this paper, we focus on describing the method we designed for automatic perceived personality prediction. We present a simple model that uses three different sets of features: nonverbal audio cues, visual cues from video, and facial landmark points. The model uses a random decision forest to do regression from the extracted features. As we discuss in Section 4, this multimodal model performs relatively...

chapter

A noise masking method with adaptive thresholds based on CASA

Feng Bao, Waleed H. Abdulla

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 5

In this paper, we propose a novel noise masking method based on Computational Auditory Scene Analysis (CASA) that uses an adaptive factor. Although CASA has succeeded in the field of speech separation and speech enhancement to some extent, the fixed thresholds used for segregation and labeling heavily affect the processing performance. Focusing on this issue, the proposed method utilizes the Normalized...

chapter

Automatic pronunciation assessment of Korean spoken by L2 learners using best feature set selection

Hyuksu Ryu, Hyejin Hong, Sunhee Kim, Minhwa Chung

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 6

This paper proposes a method for automatic pronunciation assessment of Korean spoken by L2 learners by selecting the best feature set from a collection of the most well-known features in the literature. The L2 Korean Speech Corpus is used for assessment modeling, where the native languages of the L2 learners are English, Chinese, Japanese, Russian, and Mongolian. In our system, learners' speech is...

chapter

Speech analysis and depression

Tan Tze Ern Shannon, Dai Jingwen Annie, See Swee Lan

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 4

In this paper, the correlation between the speech features of the vowel /a/ and depression severity was investigated, so as to derive a depression severity meter mobile application that can detect depression quantitatively and accurately. Results showed a correlation between depression severity and speech features, and an application prototype was created and tested to assess the predictive accuracy of...

chapter

Comparative study of multi-stage classification scheme for recognition of Lithuanian speech emotions

Tatjana Liogiene, Gintautas Tamulevicius

2016 Federated Conference on Computer Science and Information Systems (FedCSIS) > 483 - 486

This paper presents an experimental study of multi-stage classification based recognition of Lithuanian speech emotions. Three different criteria for feature selection were compared for this purpose: the Maximal Efficiency and Minimal Cross-Correlation feature criteria, and Sequential Feature Selection. A large database of spoken emotional Lithuanian language was used in this experiment - each of...

chapter

An improved pitch contour formulation for Malay language storytelling Text-to-Speech (TTS)

Izzad Ramli, Nursuriati Jamil, Noraini Seman, Norizah Ardi

2016 IEEE Industrial Electronics and Applications Conference (IEACon) > 250 - 255

In this paper, an improved pitch contour formulation is introduced by modifying the existing pitch contour sinusoidal function. The aim is to convert neutral speech into storytelling speech in Malay Language. Our speech datasets (neutral and storytelling speech) were recorded by a male and a female professional speaker. They contain 116 speech sentences, 1,164 words, and 2,732 syllables. For storytelling...

chapter

The communication potential estimation of Chinese borrowings in English

Rui Xin, Xiaolan Lei

2016 International Conference on Asian Language Processing (IALP) > 189 - 192

With the growth of China's international influence, more Chinese words are being borrowed into English. To find out how popular Chinese borrowings are in English, a questionnaire was administered to English speakers (Chinese natives are excluded). Swaan's Q-value model is employed to analyze the data. The results show that Chinese borrowings in English vary in Q-value and the ones related to economy...

chapter

An EEG and fTCD based BCI for control

Aya Khalaf, Matthew Sybeldon, Ervin Sejdic, Murat Akcakaya

2016 50th Asilomar Conference on Signals, Systems and Computers > 1285 - 1289

Brain-computer interfaces (BCIs) promise to promote a novel access channel for functional independence for individuals with severe speech and physical impairment (SSPI) that can occur as a result of numerous neurological diseases and injuries. Current BCI systems lack the robustness and accuracy to allow individuals with SSPI to complete tasks required for independent living (e.g. communication or...

chapter

Models for objective evaluation of dysarthric speech from data annotated by multiple listeners

Ming Tu, Yishan Jiao, Visar Berisha, Julie M. Liss

2016 50th Asilomar Conference on Signals, Systems and Computers > 827 - 830

In subjective evaluation of dysarthric speech, the inter-rater agreement between clinicians can be low. Disagreement among clinicians results from differences in their perceptual assessment abilities, familiarization with a client, clinical experiences, etc. Recently, there has been interest in developing signal processing and machine learning models for objective evaluation of subjective speech quality...

chapter

Characterization of the relationship between semantic and structural language features in psychiatric diagnosis

N. B. Mota, F. Carrillo, D. F. Slezak, M. Copelli, more

2016 50th Asilomar Conference on Signals, Systems and Computers > 836 - 838

Psychiatry describes speech symptoms that are indicative of disorganized thought, but measuring them is not easy. With natural language processing tools, it is possible to quantify psychiatric symptoms. Graph representations of word trajectories and semantic incoherence have independently been shown to predict the Schizophrenia diagnosis. Both analyses assess thought organization through speech, but...

chapter

Single channel speech segregation using cepstrum method

Kukku Merin Skariah, M. S. Lekshmi

2016 International Conference on Emerging Technological Trends (ICETT) > 1 - 5

In natural environments, the speech signal is affected by various kinds of acoustic interference. Many applications in audio signal processing, such as automatic speech recognition, telecommunications, and hearing aids, require an effective way of segregating the target speech from the mixed speech. Pitch information has an important role in the field of audio signal processing, especially...

chapter

The correlation between signal distance and consonant pronunciation in Mandarin words

Huijun Ding, Chenxi Xie, Lei Zeng, Yang Xu, more

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

In spoken Mandarin, some consonant and vowel pairs are hard to distinguish and pronounce clearly, even for some native speakers. This study investigates the signal distance between consonants compared in pairs, from a signal processing point of view, to reveal the correlation between signal distance and consonant pronunciation. Some popular objective speech quality measures are innovatively...

INFONA - science communication portal
