Search results

Items from 1 to 20 out of 56 results

chapter

The selection of spectral magnitude exponents for separating two sources is dominated by phase distribution not magnitude distribution

Stephen Voran

2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) > 279 - 283

2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)

Separating an acoustic signal into desired and undesired components is an important and well-established problem. It is commonly addressed by decomposing spectral magnitudes after exponentiation and the choice of exponent has been studied from numerous perspectives. We present this exponent selection problem as an approximation to the actual underlying geometric situation. This approach makes apparent...

chapter

Exploration of the additivity approximation for spectral magnitudes

Stephen Voran

2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) > 1 - 5

2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)

The separation of acoustic signals is often accomplished through subtractive decompositions of frequency-domain representations. This is typically enabled by the zero phase approximation or the un-correlated signals approximation but both of these are very coarse approximations in the mathematical sense. We investigate this disconnect between what works in practice and what is mathematically correct...

chapter

Frontiers of music technologies

Masataka Goto

2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) > 6

2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)

Music technologies will open the future up to new ways of enjoying music both in terms of music appreciation and music creation. In this keynote speech, I introduce the frontiers of music technologies by showing some practical examples to demonstrate how end users can benefit from music signal processing, music understanding technologies, singing synthesis technologies, and music interfaces. For example,...

chapter

Is audio signal processing still useful in the era of machine learning?

Emmanuel Vincent

2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) > 7

2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)

Audio signal processing has long been the obvious approach to problems such as microphone array processing, active noise control, or speech enhancement. Yet, it is increasingly being challenged by black-box machine learning approaches based on, e.g., deep neural networks (DNN), which have already achieved superior results on certain tasks. In this talk, I will try to convince that machine learning...

chapter

Mapping Arabic acoustic parameters to their articulatory features using neural networks

Yousef Ajami Alotaibi, Yasser Mohammad Seddiq

2015 IEEE Signal Processing and Signal Processing Education Workshop (SP/SPE) > 409 - 414

2015 IEEE Signal Processing and Signal Processing Education Workshop (SP/SPE)

A mapping system based on an artificial neural network was designed, trained, and tested to map Arabic acoustic parameters to their corresponding articulatory features. The main objective of the study was to find the correlation between these two different types of features. To train and test the system, an in-house database was created for all 29 Arabic alphabets as carrier words for our intended...

chapter

Gaussian mixture models with class-dependent features for speech emotion recognition

Rafael Iriya, Miguel Arjona Ramirez

2014 IEEE Workshop on Statistical Signal Processing (SSP) > 480 - 483

2014 IEEE Statistical Signal Processing Workshop (SSP)

In this paper, we propose models for emotion recognition from speech based on class-dependent features and Gaussian mixture models (GMM). Seven emotions are identified (Happiness, Fear, Neutral, Disgust, Anger, Boredom and Sadness) with a small set of features for each class. Results show that our system outperforms the single-stage classifier, with a 82.41% (74.86% in single-stage) overall recognition...

chapter

Invariance of the distributions of normalized Gram matrices

Stephen D. Howard, Songsri Sirianunpiboon, Douglas Cochran

2014 IEEE Workshop on Statistical Signal Processing (SSP) > 352 - 355

2014 IEEE Statistical Signal Processing Workshop (SSP)

Normalized Gram matrices formed from multiple vectors of sensor data, and functions of the eigenvalues of such matrices in particular, have a long history in connection with multiple-channel detection. The determinant and various other functions of the eigenvalues of these matrices arise as detection statistics in a variety of multichannel problems, and knowledge of their distributions under the H...

chapter

Affect burst recognition using multi-modal cues

Bekir Berker Turker, Shabbir Marzban, Engin Erzin, Yucel Yemez, more

2014 22nd Signal Processing and Communications Applications Conference (SIU) > 1608 - 1611

2014 22nd Signal Processing and Communications Applications Conference (SIU)

Affect bursts, which are nonverbal expressions of emotions in conversations, play a critical role in analyzing affective states. Although there exist a number of methods on affect burst detection and recognition using only audio information, little effort has been spent for combining cues in a multi-modal setup. We suggest that facial gestures constitute a key component to characterize affect bursts,...

chapter

Comparison of MFCC, LPCC and PLP features for the determination of a speaker's gender

Ergun Yucesoy, Vasif V. Nabiyev

2014 22nd Signal Processing and Communications Applications Conference (SIU) > 321 - 324

2014 22nd Signal Processing and Communications Applications Conference (SIU)

Gender information is a distinctive and the most important property in a speech. Determination of this information from a speech signal is a substantial subject. Gender information used for various purposes in many applications, provides the less error rate by defining the gender-dependent speech/speaker models. In this study, a system determining the gender of a speaker with no dependency from a...

chapter

Recognition of real time voiced direction commands

Selim Aras, Mehmet Ozturk, Ali Gangal

2014 22nd Signal Processing and Communications Applications Conference (SIU) > 2245 - 2248

2014 22nd Signal Processing and Communications Applications Conference (SIU)

In this paper, five different voiced direction command recognition is realized in real-time. Speech detection step is performed on voice recordings that includes five different direction commands. Mel frequency cepstrum coefficients (MFCC) and Linear Predictive Coding (LPC) coefficients are utilized to extract feature vectors and training data set. k-nearest neighbor classification algorithm is used...

chapter

Detection of Alzheimer's disease using prosodic cues in conversational speech

Ali Khodabakhsh, Serhan Kuscuoglu, Cenk Demiroglu

2014 22nd Signal Processing and Communications Applications Conference (SIU) > 1003 - 1006

2014 22nd Signal Processing and Communications Applications Conference (SIU)

Automatic diagnosis of the Alzheimer's disease as well as monitoring of the diagnosed patients can make significant economic impact on societies. We investigated an automatic diagnosis approach through the use of speech based features. As opposed to standard tests that are mostly focused on memory recall, spontaneous conversations are carried with the subjects in informal settings. Prosodic speech...

chapter

Audio-based gender and age identification

O. Ozgur Bozkurt, Z. Cihan Taygi

2014 22nd Signal Processing and Communications Applications Conference (SIU) > 1371 - 1374

2014 22nd Signal Processing and Communications Applications Conference (SIU)

Nowadays interaction between humans and computers is increasing rapidly. Efficiency and comfort of these interactions depend on the availability of user information to computers. Gender, age and emotional state are the most fundamental pieces of this information. Extraction of such information from audio or video data is an important research area. There are several works on different languages...

chapter

Extracting the prosodic information for Turkish broadcast news data and using on the sentence segmentation task

Dogan Dalva, Izel D. Revidi, Umit Guz, Hakan Gurkan

2014 22nd Signal Processing and Communications Applications Conference (SIU) > 1810 - 1813

2014 22nd Signal Processing and Communications Applications Conference (SIU)

In this study, extracting the prosodic information for Turkish Broadcast News Data using the open source tools and comparing the sentence segmentation performances of these grouped prosodic information on the raw data obtained as an output from the Automatic Speech Recognition System are established. Especially for the sentence segmentation task, a very promising prosodic feature set is obtained.

chapter

Protocol and baseline for experiments on Bogazici University Turkish emotional speech corpus

Heysem Kaya, Albert Ali Salali, Sadik Fikret Gurgen, Hazim Ekenel

2014 22nd Signal Processing and Communications Applications Conference (SIU) > 1698 - 1701

2014 22nd Signal Processing and Communications Applications Conference (SIU)

This study aims at presenting an emotional corpus collected at Boğaziçi University / Electrical and Electronics Department, on which no previous signal processing and machine learning study was done for classification purposes. It also aims at providing the protocol for further experiments on this corpus. The emotional corpus consists of 484 speech utterances from 11 amateur actors acting 11 emotionally...

chapter

Maximum-likelihood based 3D acoustical signature estimation

Banu Gunel

2014 22nd Signal Processing and Communications Applications Conference (SIU) > 786 - 789

2014 22nd Signal Processing and Communications Applications Conference (SIU)

An audio recording, made in a real environment, carries an acoustical signature which changes according to the acoustical characteristics of the environment and the recording positions. This signature which is similar to a 3D room impulse response contains the directions, levels and arrival times of the direct source and reflections. Although it is easy to obtain reverberant recordings by convolving...

chapter

Nonconvex compressed sensing with partially known support

Taner Ince, Arif Nacaroglu, Nurdal Watsuji

2012 20th Signal Processing and Communications Applications Conference (SIU) > 1 - 4

2012 20th Signal Processing and Communications Applications Conference (SIU)

We study recovering sparse and compressible signals using l_p minimization with p < 1 when some part of the support of the signal is known a priori. Sparse reconstruction method based on l_p minimization with partially known set is proposed. Recovery conditions of l_p minimization with partially known support is given. Theoretical results show that l_p minimization with partially known set is...

chapter

Towards the influence of vibration on evaluation of speech utterances in mobile devices

Henrik von Coler, Shiva Sundaram, Robert Schleicher, Gabriel Curio

2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) > 297 - 300

2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)

In this paper the influence of vibrotactile stimuli on the evaluation of loudness of speech is investigated. The speech utterances consisted of context-free consonant-vowel-consonant (CVC) and vowel-consonant-vowel (VCV) utterances played over headphones on a mobile phone. The phone's standard vibration was used as the tactile stimulus. Using an AB/X paradigm, 32 untrained subjects evaluated loudness...

chapter

Wave steganography

A Umbarkar, A Joshi, A Jadhav

2010 IEEE International Conference on Computational Intelligence and Computing Research > 1 - 5

2010 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC 2010)

Steganography is the art of hiding message in order to have a secure data communication. This paper addresses a technique for wave steganography. In this paper we proposed the idea to replace bits according to the distortion afforded with lossy or lossless and recovery methods. Carrier file bits are replaced by message file. Message embedded in this method is in form of wave. Hidden message can be...

chapter

Wave Steganography Approach by Modified LSB

A.J. Umbarkar, A.P. Joshi, A.A. Jadhav, A.R. Buchade

2009 Second International Conference on Emerging Trends in Engineering & Technology > 862 - 865

2009 2nd International Conference on Emerging Trends in Engineering and Technology (ICETET 2009)

Steganography is the art of hiding message in order to have a secure communication. In this paper, we present a novel technique for wave steganography for covert communication. The basic idea proposed in this paper is replacement of the bits according to the distortion afforded, with lossy or lossless hiding and recovery. Numbers of bits of the samples in cover file are replaced in accordance with...

chapter

An enhanced empirical modal decomposition without sifting

N. Azzaoui, H. Snoussi, J. Duchene

2009 IEEE/SP 15th Workshop on Statistical Signal Processing > 796 - 799

2009 IEEE/SP 15th Workshop on Statistical Signal Processing (SSP)

In this work, a new empirical mode decomposition (EMD) is introduced. It does not use extrema envelopes nor sifting procedure but the decomposition is only based on a direct calculation of its components from inflexion points. Our technique has many advantages: firstly, in contrast to the classical EMD, we give an analytical formula for the decomposition. Finally, a simulation study shows its efficiency.

Data set: ieee
Keywords: CONFERENCES, SIGNAL PROCESSING, SPEECH
