Detecting depression via speech has been an attractive research topic in recent years. A significant correlation has been found between speech pause time and depressive severity. In the present study, 92 depressed patients and 92 age-, gender- and education level-matched control participants were examined to investigate three temporal characteristics of speech: recording time (RT), phonation time (PT) and speech pause...
We propose an efficient method to estimate source power spectral densities (PSDs) in a multi-source reverberant environment using a spherical microphone array. The proposed method utilizes the spatial correlation between the spherical harmonics (SH) coefficients of a sound field to estimate source PSDs. The use of the spatial cross-correlation of the SH coefficients allows us to employ the method...
Multi-channel linear prediction (MCLP) has been shown to be a suitable framework for tackling the problem of blind speech dereverberation. In recent years, a number of adaptive MCLP algorithms have been proposed, the majority of which operate in the short-time Fourier transform (STFT) domain. In this paper, we focus on the STFT-based Kalman filter solution to the adaptive MCLP task. Similarly to all...
With increasing stress at work and in study, mental health has become a major problem in current social research. Generally, researchers can analyze psychological health states by using social perception behavior. The speech signal is an important research direction in this domain. It objectively assesses the mental health of social groups through the extraction and fusion of speech features...
This paper describes software that provides an automated, objective, quantitative assessment of a patient's speech quality. The evaluation is carried out during speech rehabilitation after surgical treatment of oncological diseases of the organs of the speech-forming tract. The evaluation is obtained by comparing patient records before the operation...
This paper addresses the problem of speech quality enhancement and acoustic noise reduction by adaptive filtering algorithms. We propose a new version of the set-membership partial-update normalized least mean square (SM-PU-NLMS) algorithm. The proposed algorithm is based on the use of the cross-correlation of the output error signal and the noisy signal to control the variable step size in...
This paper describes three methods for multiple fundamental frequency estimation based on multi-scale product analysis. All three methods use the autocorrelation of the multi-scale product analysis to estimate the target pitch. For the intrusion pitch, each method uses its own technique. The first uses classic comb filtering. The second employs a rectangular comb filter followed...
In this paper, we address the estimation of the power spectral density (PSD) matrix. Accurate estimation of the PSD matrix plays an important role in many speech enhancement methods. In traditional PSD estimation methods, only the information of previous frames is employed through a forgetting factor. In the current research, we consider the correlation of inter-band components and incorporate their information...
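For orientation, the traditional forgetting-factor estimate mentioned in the abstract above is commonly written as a first-order recursion over frames. The sketch below is only an illustration of that general idea for a single channel, not the method of the paper; the function name `recursive_psd` and the parameter `alpha` are hypothetical names chosen here.

```python
def recursive_psd(frames, alpha=0.9):
    """Smooth per-bin periodograms over time with a forgetting factor.

    frames: iterable of sequences of complex STFT coefficients,
            one sequence per time frame (assumed input layout).
    alpha:  forgetting factor in [0, 1); larger values weight
            past frames more heavily.
    Returns a list with one smoothed PSD estimate per frame.
    """
    psd = None
    history = []
    for frame in frames:
        # Instantaneous power in each frequency bin.
        periodogram = [abs(c) ** 2 for c in frame]
        if psd is None:
            # Initialise with the first frame's periodogram.
            psd = periodogram
        else:
            # P(t) = alpha * P(t-1) + (1 - alpha) * |Y(t)|^2
            psd = [alpha * p + (1.0 - alpha) * q
                   for p, q in zip(psd, periodogram)]
        history.append(psd)
    return history
```

Each estimate depends only on the current and earlier frames of the same frequency bin, which is the limitation the abstract points to when it proposes using inter-band correlation as well.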
Parkinson's Disease (PD) is a neurodegenerative disorder that is frequently correlated with vowel articulation difficulties. The phonation problem that arises in patients affected by PD is commonly known as Parkinsonian dysarthria and is identified by vocal signal analysis. The analysis supports physicians and specialists in the early detection and monitoring of dysarthria, aiming to increase patients' quality of life...
This paper presents a study of how speech features have comparable parameters among blood relations. Mel Frequency Cepstral Coefficients (MFCC) are used to extract features from the input speech signal, and vector quantization through a modified k-means LBG (Linde, Buzo, and Gray) algorithm is implemented to analyse and estimate the similarity to perform related studies. The study is...
Recently, a multi-frame minimum variance distortionless response (MFMVDR) filter for single-microphone noise reduction has been proposed, which exploits speech correlation across consecutive time frames. It has been shown that the MFMVDR filter achieves impressive results when the speech interframe correlation vector can be accurately estimated. In this paper, we analyze the influence of estimation...
Background noise reduction has been studied for many years. However, the suppression of unwanted human speech noise has not been well studied, owing to the sparsity of the speech signal. Traditional blind source separation (BSS) methods such as independent component analysis (ICA) assume prior knowledge of the number of sources and require that the number of sources equal the number of sensors. The above limitations...
Tracheoesophageal (TE) speech is generated by patients who have undergone a total laryngectomy, in which the larynx (voice box) is removed and replaced by a tracheoesophageal puncture. This work presents a novel low-complexity algorithm to estimate the degree of severity of disordered TE speech. The proposed algorithm uses features which are computed from 32-ms voiced frames of the speech signal. A 21-st...
In recent times, there has been significant interest in the machine recognition of human emotions, due to the suite of applications to which this knowledge can be applied. A number of different modalities, such as speech or facial expression, individually and with eye gaze, have been investigated by the affective computing research community to either classify the emotion (e.g. sad, happy, angry)...
Hypokinetic dysarthria (HD) and freezing of gait (FOG) are frequent symptoms of Parkinson's disease (PD). The aim of this work is to reveal pathological mechanisms common to HD and FOG, and to use acoustic analysis of dysarthric speech to assess gait difficulties in PD. We used a correlation analysis to investigate the relationship between speech features and FOG evaluated by freezing of gait questionnaire...
In this paper, the multidimensional phonological feature structure of Arabic is investigated. Our goal is to assess the performance of statistical and connectionist approaches in performing the complex mappings between distinctive phonetic features (DPF) and associated acoustic cues. The present study explores the mapping between 29 phonological voicing, place, and manner features and Mel-frequency...
Up to 90% of patients with Parkinson's disease (PD) suffer from hypokinetic dysarthria (HD). In this work, we analysed the power of conventional speech features quantifying imprecise articulation, dysprosody, speech dysfluency and speech quality deterioration extracted from a specialized poem recitation task to discriminate dysarthric and healthy speech. For this purpose, 152 speakers (53 healthy...
Feature selection is a crucial step in the development of a system for identifying emotions in speech. How to select high-correlation features is an open question. This paper focuses on a feature selection method that aims to extract the most effective acoustic features to improve the performance of emotion recognition. Emotional feature selection of speaker-independent speech based on Random Forest...
The article considers the pre-processing of voice signals for voice recognition systems based on artificial neural networks. The preprocessing is based on segmentation of the speech signal according to the phonetic transcription of the language, in order to reduce the amount of data supplied to the input of the neural network, which considerably improves its input data sensitivity. Application of numerical...
The paper proposes using only mostly voiced speech (MVS) for speaker verification (SV). The speech is partitioned into an MVS part and a non-MVS part by a simple machine classification. SV experiments were conducted with a standard Gaussian mixture model (GMM) with universal background model (UBM) system and a GMM with computationally improved individual background model (IBM) system. They demonstrate...