In this paper a novel steganographic method, called Hide F0, dedicated to IP telephony is proposed. It is based on the approximation of the parameter that describes the F0 frequency (the pitch) of the speaker's voice. We show that thanks to approximating some fragments of the "fine pitch" parameter in the Speex codec we can create efficient hidden transmission channels. We determined that...
This paper presents a study of VoIP Quality of Experience (VoIP-QoE) for one well-known VoIP application and three modern ones: Skype, LINE, Tango and Viber, using Perceptual Evaluation of Speech Quality (PESQ) with English speech samples. The study found that Skype and LINE tend to provide better VoIP quality than Tango and Viber, particularly when used over good, stable 3G networks...
This paper presents a method for enhancing the intelligibility of speech signals using the transient components in the time domain. The wavelet transform is used to extract the transient components, and a ramp function with a damped motion models their behavior. The wavelet transform of a transient component represents a wavelet atom and is calculated by superposition of these...
With wireless acoustic sensor networks extending to services such as surveillance of sensitive areas (for example, the Line of Control or unmanned terrain), interest in robust, narrowband, low-bit-rate speech codecs is increasing. This has resulted in a need to evaluate such codecs. This work investigates different factors such as bit rate, algorithmic delay, implementation, and more importantly...
Much research has addressed design approaches for speech enhancement, focusing mainly on speech quality and intelligibility in order to produce a high-performance speech signal. The Wiener filter is one of the adaptive filter algorithms that adjust filter coefficients to produce an output signal satisfying some statistical criterion. The objective measures will optimize using informal...
This paper presents three speech enhancement schemes for suppressing background noise. The purpose of speech enhancement is to improve the quality of the processed speech. The paper also investigates the effect of all three enhancement schemes on speech intelligibility.
We present NISQ, a data-driven non-intrusive speech quality measure that has been trained to predict the PESQ score for a given speech signal. NISQ is based on feature extraction and a binary tree regression based model. A training method using the intrusive PESQ algorithm to automatically label large quantities of speech data is presented and utilized. Our method is shown to predict PESQ with an...
Audio conferencing is one of the main features provided by VoIP telecommunication systems. Along with factors such as background noise, low audio level, delay and packet loss, the audio mixing algorithm also contributes noise to the output of an audio conferencing system. The true mixing algorithm suffers from overflow/underflow, which adds noise in the form of clipping...
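The overflow problem described in the abstract above can be illustrated with a minimal sketch. It assumes 16-bit signed PCM samples and takes "true mixing" to mean a plain sample-wise sum of the participants' streams; the function names are hypothetical and are not taken from the paper.

```python
# Assumption: samples are 16-bit signed PCM values held in plain Python lists.
INT16_MIN, INT16_MAX = -32768, 32767


def mix_true(streams):
    """Sample-wise sum of equal-length streams, with no range limiting."""
    return [sum(samples) for samples in zip(*streams)]


def mix_saturating(streams):
    """Sample-wise sum, hard-limited to the 16-bit range (this is the clipping)."""
    return [max(INT16_MIN, min(INT16_MAX, s)) for s in mix_true(streams)]


def count_clipped(streams):
    """Number of mixed samples that fall outside the 16-bit range."""
    return sum(1 for s in mix_true(streams) if s < INT16_MIN or s > INT16_MAX)


if __name__ == "__main__":
    speaker_a = [20000, -20000, 1000]
    speaker_b = [20000, -20000, 2000]
    print(mix_true([speaker_a, speaker_b]))        # raw sums, may exceed 16 bits
    print(mix_saturating([speaker_a, speaker_b]))  # sums limited to 16 bits
    print(count_clipped([speaker_a, speaker_b]))   # how many samples were limited
```

Two loud speakers talking at once are enough to push the sum past the 16-bit range, so the limiter flattens the peaks and distorts the waveform.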
In the audio encoder of Surveillance Video and Audio Coding (SVAC), both audio signals and mel-frequency cepstral coefficients (MFCCs) are coded, which leads to high computational complexity. This paper proposes a novel scheme for SVAC in which the speech coding module based on Algebraic Code Excited Linear Prediction (ACELP) is removed completely and speech waveforms can be reconstructed from MFCCs...
In this paper, we analyze three QoE-based speech quality evaluation models: PESQ, NPESQ and POLQA. PESQ (Perceptual Evaluation of Speech Quality) is a well-known objective speech quality assessment method for speech QoE evaluation. It is standardized as ITU-T Recommendation P.862. The NPESQ (New Perceptual Evaluation of Speech Quality) model is a new objective QoE model on evaluating the speech...
This paper presents a comparison of end-to-end recovery and control methods for the ITU-T G.729 CS-ACELP codec operating at 8 kbit/s. We tested media-independent forward error correction (FEC) parity 3, media-independent FEC parity 4 and media-specific FEC, combined with queue management methods, namely Drop Tail, Predictive Loss Pattern (PLoP) and Random Early Detection (RED). The performance measures...
The standard ITU-T G.729 coder uses interframe quantization of the line spectrum frequency (LSF) parameters, which causes errors to propagate to subsequent frames. We propose the use of intraframe quantization schemes to overcome this problem. We give a performance comparison between intraframe and interframe quantization methods of LSP parameters for ITU-T G.729. Simulation results show that our...
The performance of traditional speech enhancement techniques such as spectral subtraction and log-Minimum Mean Squared Error Short Time Spectral Amplitude (log-MMSE STSA) estimation degrades in the presence of highly non-stationary noises such as babble noise. This is mainly due to inaccurate noise estimation during the voiced segments of the speech signal. In this paper, we propose to exploit the fine structure...
This paper presents a preliminary study of the quality of experience (QoE) of VoIP provided as a social network service, using Perceptual Evaluation of Speech Quality (PESQ) with Thai and Chinese speech samples. This study focuses on the VoIP quality of the free call feature from Facebook and LINE, which are, respectively, the popular social network site and the popular social network application for Thai users...
Maintaining good Quality-of-Experience (QoE) is crucial for Voice-over-IP (VoIP) applications, particularly those operating across the public Internet. Accurate online estimation of QoE as perceived by end users allows VoIP applications to take steps to improve QoE when it falls below acceptable levels. ITU-T Recommendation G.107 introduced the E-model, which provides a means to assess QoE levels for...
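The E-model named in the abstract above produces a transmission rating factor R, which ITU-T G.107 maps to an estimated mean opinion score (MOS). The following is a minimal sketch of that R-to-MOS conversion only; it does not compute R from network impairments and is not the estimation method of the paper. The function name `r_to_mos` is hypothetical.

```python
def r_to_mos(r: float) -> float:
    """Map an E-model rating factor R to an estimated MOS (G.107 conversion)."""
    if r <= 0:
        return 1.0   # floor of the scale
    if r >= 100:
        return 4.5   # ceiling of the scale
    # Cubic mapping between the two ends of the scale.
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


if __name__ == "__main__":
    # 93.2 is the rating obtained with the G.107 default parameter values.
    for r in (50.0, 70.0, 93.2):
        print(f"R = {r:5.1f} -> MOS = {r_to_mos(r):.2f}")
```

An application could compare the returned MOS against a threshold of its own choosing to decide when to react, for example by switching codec or adding redundancy.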
Most accurate speech enhancement methods focus mainly on quality and intelligibility, achieving a high performance level by using compression techniques. A novel speech enhancement algorithm using compressive sensing (CS) follows a different paradigm from compression techniques, with low-dimensional geometry for transmission or storage. The CS algorithm can directly acquire compressed...
Currently, there are more than 500 Quranic recitations freely available on the internet. There is also a growing trend toward using smartphones rather than traditional desktop PCs to access the internet. On such limited devices, high-quality speech compression for Quranic recitation is desirable. In this paper, we developed a high-quality speech compression for Quranic recitation by modifying...
This paper presents a comparative analysis of the enhancement of noisy single-channel Hindi speech patterns using a binary mask threshold function in mother wavelet transforms. A three-level wavelet decomposition is used, and each of the three levels is passed individually to the binary mask threshold to remove noise and enhance the speech patterns. The suitability of the binary...
This paper addresses a packet loss concealment (PLC) method based on piggybacking to reduce the speech quality degradation caused by packet losses for code excited linear predictive (CELP) type coders. We applied our proposed scheme to the standard ITU-T G.729 Conjugate-Structure Algebraic CELP (CS-ACELP) speech coder to evaluate its performance. The average spectral distortion (Avg. SD), the perceptual...
This paper undertakes a detailed comparative analysis of PESQ and VISQOL model behaviour when tested against speech samples modified through playout delay adjustments. The adjustments are typical (in extent and magnitude) of those introduced by VoIP jitter buffer algorithms. Furthermore, the analysis examines the impact of adjustment location as well as speaker factors on MOS scores predicted...