Items 21 to 38 of 38 results

chapter

Classifying NMF components based on vector similarity for speech and music separation

Nengheng Zheng, Yi Cai, Xia Li, Tan Lee

Proceedings of The 2012 Asia Pacific Signal and Information Processing Association Annual Summit and Conference > 1 - 6

2012 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)

This paper presents a nonnegative matrix factorization (NMF) components classification algorithm for single-channel speech and music separation. Music only and music-speech mixture segments are firstly classified from the audio stream via audio segmentation technique. Then NMF is applied for signal decomposition. The basis matrix of the NMF output of music only segments provides the prior knowledge...

chapter

FFT-based spectro-temporal analysis and synthesis of sounds

Chung-Chien Hsu, Ting-Han Lin, Tai-Shih Chi

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5388 - 5391

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The concept of the two-dimensional spectro-temporal modulation filtering of the auditory model [1] is implemented for the FFT spectrogram. It analyzes the spectrogram in terms of the temporal dynamics and the spectral structures of the sound. The overlap and add (OLA) method, which is more convenient and reliable than the iterative-projection method proposed in [1], is used to invert the FFT spectrogram...

chapter

The psychoacoustic approach towards enhancing speech intelligibility in noise

Paul Yaozhu Chan, Minghui Dong, Ling Cen, Haizhou Li

2010 7th International Symposium on Chinese Spoken Language Processing > 238 - 241

7th International Symposium on Chinese Spoken Language Processing (ISCSLP 2010)

In this paper, we propose a psychoacoustic approach towards enhancing speech intelligibility in noise. Understanding the relationship between the short-term spectral movement of a sound and a listener's sensitivity towards it, we conjecture that humans rely greatly on Inter-Phoneme Spectral Gradients (IPSGs) to distinguish each phoneme, especially when the short-term speech spectrum is masked by extremely...

chapter

Spectrogram dimensionality reduction with independence constraints

Kevin W Wilson, Bhiksha Raj

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 1938 - 1941

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

We present an algorithm to find a low-dimensional decomposition of a spectrogram by formulating this as a regularized non-negative matrix factorization (NMF) problem with a regularization term chosen to encourage independence. This algorithm provides a better decomposition than standard NMF when the underlying sources are independent. It is directly applicable to non-square matrices, and it makes...

chapter

Latent-variable decomposition based dereverberation of monaural and multi-channel signals

R Singh, B Raj, P Smaragdis

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 1914 - 1917

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

We present an algorithm to dereverberate single- and multi-channel audio recordings. The proposed algorithm models the magnitude spectrograms of clean audio signals as histograms drawn from a multinomial process. Spectrograms of reverberated signals are obtained as histograms of draws from the PDF of the sum of two random variables, one representing the spectrogram of clean speech and the second the...

chapter

Optimal filters for extraction and separation of periodic sources

Mads Græsbøll Christensen, Andreas Jakobsson

2009 Conference Record of the Forty-Third Asilomar Conference on Signals, Systems and Computers > 376 - 379

2009 43rd Asilomar Conference on Signals, Systems and Computers

In this paper, the problem of extracting periodic signals, like voiced speech or tones in music, from noisy observations or mixtures of periodic signals is considered, and, in particular, the problem of designing filters for such a task. We propose a novel filter design that 1) is specifically aimed at extracting periodic signals, 2) is optimal given the observed signal and thus signal-adaptive, and...

chapter

Music source separation synthesis using Multiple Input Spectrogram Inversion

D. Gunawan, D. Sen

2009 IEEE International Workshop on Multimedia Signal Processing > 1 - 5

2009 IEEE International Workshop on Multimedia Signal Processing (MMSP)

In this paper, we propose a novel method of refining the time-domain synthesis of individual source estimates from a single channel mixture. Employing a closed-loop architecture, the algorithm refines the synthesis of each source by iteratively estimating the phase of the sources, given the estimates of the source magnitude spectra and a single channel time-domain mixture. The performance of the algorithm...

chapter

Missing data imputation using compressive sensing techniques for connected digit recognition

J. Gemmeke, B. Cranen

2009 16th International Conference on Digital Signal Processing > 1 - 8

2009 16th International Conference on Digital Signal Processing (DSP)

An effective way to increase the noise robustness of automatic speech recognition is to label noisy speech features as either reliable or unreliable (missing) prior to decoding, and to replace the missing ones by clean speech estimates. We present a novel method based on techniques from the field of Compressive Sensing to obtain these clean speech estimates. Unlike previous imputation frameworks which...

chapter

Learning speech features in the presence of noise: Sparse convolutive robust non-negative matrix factorization

R. de Frein, S.T. Rickard

2009 16th International Conference on Digital Signal Processing > 1 - 6

2009 16th International Conference on Digital Signal Processing (DSP)

We introduce a non-negative matrix factorization technique which learns speech features with temporal extent in the presence of non-stationary noise. Our proposed technique, namely Sparse convolutive robust non-negative matrix factorization, is robust in the presence of noise due to our explicit treatment of noise as an interfering source in the factorization. We derive multiplicative update rules...

chapter

Note onset detection for the transcription of polyphonic piano music

C.G.v.d. Boogaart, R. Lienhart

2009 IEEE International Conference on Multimedia and Expo > 446 - 449

2009 IEEE International Conference on Multimedia and Expo (ICME)

Transcription of music is the process of generating a symbolic representation such as a score sheet or a MIDI file from an audio recording of a piece of music. A statistical machine learning approach for detecting note onsets in polyphonic piano music is presented. An area from the spectrogram of the sound is concatenated into one feature vector. A cascade of boosted classifiers is used for dimensionality...

chapter

An algorithm for speech segregation of co-channel speech

S. Vishnubhotla, C.Y. Espy-Wilson

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 109 - 112

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

This paper introduces an algorithm to separate speech streams from a single-channel speech mixture. Most current speech segregation algorithms allocate speech regions to participating speakers depending on which speaker dominates in which spectro-temporal region. The proposed method is a different approach to speech segregation, in that it separates the participating speaker streams rather than decide...

chapter

Sparse imputation for noise robust speech recognition using soft masks

J.F. Gemmeke, B. Cranen

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 4645 - 4648

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

In previous work we introduced a new missing data imputation method for ASR, dubbed sparse imputation. We showed that the method is capable of maintaining good recognition accuracies even at very low SNRs provided the number of mask estimation errors is sufficiently low. Especially at low SNRs, however, mask estimation is difficult and errors are unavoidable. In this paper, we try to reduce the impact...

chapter

Robust speech dereverberation based on non-negativity and sparse nature of speech spectrograms

H. Kameoka, T. Nakatani, T. Yoshioka

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 45 - 48

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

This paper presents a blind dereverberation method designed to recover the subband envelope of an original speech signal from its reverberant version. The problem is formulated as a blind deconvolution problem with non-negative constraints, regularized by the sparse nature of speech spectrograms. We derive an iterative algorithm for its optimization, which can be seen as a special case of the non-negative...

chapter

Wavelet based speech signal de-noising using hybrid thresholding

M.G. Sumithra, K. Thanuskodi

2009 International Conference on Control, Automation, Communication and Energy Conservation > 1 - 7

2009 International Conference on Control, Automation, Communication and Energy Conservation (INCACEC)

The wavelet transform has become a powerful tool of signal analysis and is widely used in many applications including signal detection and de-noising. Wavelet thresholding de-noising techniques provide a new way to reduce background noise in speech signal. However, the soft thresholding is best in reducing noise but worst in preserving edges, and hard thresholding is best in preserving edges but worst...

chapter

An Improved LSA-MMSE Speech Enhancement Approach Based on Auditory Perception

Linyu Gong, Changxing Chen, Qi Chen, Haoxiang Xu

2008 International Seminar on Future Information Technology and Management Engineering > 292 - 295

2008 International Seminar on Future Information Technology and Management Engineering

The gain function of traditional enhancement algorithms estimates every signal spectral component and therefore introduces relatively more speech distortion. To improve speech enhancement at low signal-to-noise ratio (SNR), this paper proposes an optimal speech enhancement scheme. Based on auditory perception properties, no estimator for noise masked spectrum and classical enhancement...

chapter

A new technique for street noise reduction in signal processing applications

C.V. Rama Rao, M.B.R. Murthy, K.A. Sheela

TENCON 2008 - 2008 IEEE Region 10 Conference > 1 - 5

TENCON 2008 - 2008 IEEE Region 10 Conference

This paper proposes a two-stage hybrid speech enhancement system with nonuniform subbands. Frequency bins after the Fourier transform are nonuniformly grouped to reduce the computation needed to calculate the spectral gain. The first stage includes a soft decision gain modification applied to the Ephraim-Malah gain function based on minimum mean square error estimation (MMSE) and a psychoacoustic masking...

chapter

A Neural Network based local SNR estimation for estimating spectral masks

A.H. Hadjahmadi, M.M. Homayounpour, S.M. Ahadi

2008 International Symposium on Telecommunications > 608 - 613

2008 International Symposium on Telecommunications

In this work, we present a new mask estimation technique that uses a neural network classifier to determine the reliability of spectrographic elements. In addition some different kinds of features used for classification were compared that make no assumptions about the corrupting noise signal, but rather exploit spectrographic characteristics of the speech signal. The performance of the proposed method...

chapter

Noise suppression based on approximate KLT with wavelet packet expansion

Chung-Hsien Yang, Jhing-Fa Wang

2002 IEEE International Conference on Acoustics, Speech, and Signal Processing > 1 > I-565 - I-568

Proceedings of ICASSP '02

In this paper, we perform the noise suppression based on approximate Karhunen-Loeve transform (KLT). The discrete cosine transform (DCT) has been a good candidate for approximate KLT when the signal is modeled as an autoregressive process. However, for nonstationary signals, wavelet transform is more capable than DCT while approximating KLT. To calculate approximate KLT, we first represent the signal...
