In this paper, we investigate the effect of the G.723.1 codec (6.3 kbps) on a speaker recognition system. To improve robustness to codec mismatch, we use Power Normalized Cepstral Coefficients (PNCC), a recently introduced robust acoustic feature, to improve the performance of the speaker verification system. A modified SCF speech feature is also proposed to improve robustness under codec mismatch...
This paper investigates the robustness of two state-of-the-art action recognition algorithms: a pixel domain approach based on 3D convolutional neural networks (C3D) and a compressed domain approach requiring only partial decoding of the video, based on feature description using motion vectors and Fisher vector encoding (MV-FV). We study the robustness of the two algorithms against: (i) quality variations,...
This paper reveals the potential gain in audio quality that can be achieved by combining Spherical Logarithmic Quantization (SLQ) with advanced broadband error robust low delay audio coding based on ADPCM. We briefly summarize the basic properties and mechanisms of SLQ and the employed ADPCM scheme and show how they can be combined in a freely parameterizable coding algorithm. The resulting codec...
Many existing video transmission and storage infrastructures are still unable to handle uncompressed UHD video in real time. For instance, the transmission of 4K UHD 4:2:2 10-bit 60p video requires approximately 4 times the bandwidth available on a 3G-SDI cable. To reduce the required bitrates, a low-latency lightweight compression scheme is needed. To this end, several standardization efforts...
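The bandwidth figure quoted in this abstract can be checked with a rough calculation. The sketch below is an illustration, not taken from the paper: it assumes that "4K UHD" means 3840x2160 pixels, counts active pixels only (no blanking or ancillary data), and uses the nominal 3G-SDI link rate of 2.97 Gbit/s. The function name `active_video_bitrate` is hypothetical.

```python
# Rough estimate of uncompressed video bitrate versus a 3G-SDI link.
# Assumptions (not from the abstract): 3840x2160 resolution, active
# pixels only, nominal 3G-SDI rate of 2.97 Gbit/s.

SDI_3G_BPS = 2.970e9  # nominal 3G-SDI link rate in bit/s

# Average number of samples per pixel for common chroma subsampling modes.
SAMPLES_PER_PIXEL = {"4:4:4": 3.0, "4:2:2": 2.0, "4:2:0": 1.5}


def active_video_bitrate(width, height, fps, bit_depth, chroma="4:2:2"):
    """Return the bitrate in bit/s of the active picture area only."""
    return width * height * fps * bit_depth * SAMPLES_PER_PIXEL[chroma]


uhd_bps = active_video_bitrate(3840, 2160, 60, 10, "4:2:2")
ratio = uhd_bps / SDI_3G_BPS

print(f"Active video bitrate: {uhd_bps / 1e9:.2f} Gbit/s")
print(f"Ratio to one 3G-SDI link: {ratio:.2f}")
```

Under these assumptions the active picture alone needs a little over three 3G-SDI link rates, so in practice four links are required, which is consistent with the "approximately 4 times" figure once blanking overhead is included.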
This paper discusses the voice and audio quality characteristics of EVS, the recently standardized 3GPP codec. Frame erasure conditions in particular were evaluated. EVS was compared with the industry-standard voice codecs 3GPP AMR and AMR-WB, as well as with direct signals at varying bandwidths. Speech quality was evaluated with two subjective listening tests containing clean and noisy speech in the Finnish language...
This paper addresses the combination of a low delay subband ADPCM-based audio codec with adaptive pre- and post-filtering for psychoacoustic noise shaping. We present how our basic scheme for error robust subband coding can be combined with two cascades of band shelving filters. The gain parameters of these filters are adapted by an algorithm that is based on power estimates which are obtained from...
This paper proposes the unification of the code-excited linear prediction (CELP) codec process with watermarking based on formant tuning. The serial problem in watermarking and then encoding with the CELP codec was thereby reduced by using the proposed method, which also increased the bit detection rate. We took advantage of two key properties: I) humans do not perceive alterations applied to formants...
Audio hashes are compact and robust representations of audio data and allow the efficient identification of specific recordings and their transformations. Audio hashing for music identification is well established and similar algorithms can also be used for speech data. A possible application is the identification of replayed telephone spam. This contribution investigates the security and privacy...
In this paper we describe the construction and performance of classifiers able to identify variable-rate VoIP traffic flows rapidly, reliably and independently of the application version that generated them. We show that features calculated on short sequences of packets extracted from the flow (sub-flows) are sufficient to identify VoIP flows with a Recall of 99% and a Precision of 90%. The features we...
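As a reminder of what the Recall and Precision figures in this abstract measure, the sketch below computes both from confusion counts. The counts are hypothetical, chosen only so that the results match the quoted 99% and 90%; they are not data from the paper, and the function name `precision_recall` is an illustrative assumption.

```python
# Precision and recall for a binary "VoIP vs. other traffic" classifier.
# The counts used below are hypothetical and are NOT from the paper.


def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) computed from confusion counts."""
    precision = true_pos / (true_pos + false_pos)  # flagged flows that really are VoIP
    recall = true_pos / (true_pos + false_neg)     # VoIP flows that were flagged
    return precision, recall


# Hypothetical example: 1000 real VoIP sub-flows, 990 of them detected,
# plus 110 non-VoIP sub-flows wrongly labelled as VoIP.
precision, recall = precision_recall(true_pos=990, false_pos=110, false_neg=10)

print(f"Precision: {precision:.2%}")
print(f"Recall:    {recall:.2%}")
```

A high recall with a lower precision, as in this illustration, means that almost every VoIP flow is caught, at the cost of some non-VoIP flows being mislabelled as VoIP.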
There have recently been serious social issues involved in multimedia signal processing such as malicious attacks and tampering with digital audio/speech signals. Fragile speech watermarking is a technique that enables the detection of tampering with the original signals. We previously proposed an inaudible digital-audio watermarking approach based on cochlear delay. We investigated how the proposed...
In this paper, a time-domain audio watermarking scheme is proposed in which embedding is done in two different marking spaces, obtained from the host audio by exploiting the properties of the polar coordinate system. This technique has the advantage of higher embedding capacity because it uses the same set of audio samples twice during insertion of the watermark message. Simulation results...
In this paper, we utilize sender-based Forward Error Correction (FEC) techniques to enhance the robustness of packet loss recovery for the AVS Mobile speech and audio (AVS-M) codec. Two FEC schemes are proposed which take advantage of the codec's structural characteristics and do not introduce extra delay. The objective and subjective listening test results show that the two methods achieve higher...
A practical error-resilient M-description codec scheme is designed to combat bit errors in wireless broadcasting networks and raise the quality of the reconstructed signal. The signal is coded into a large number of mutually refinable descriptions by a robust staggered M-description scalar quantizer (RSMDSQ). Then an index assignment method is used to enhance the error-resilient capacity of any...
In this paper we present a method to alleviate the performance degradation caused by non-ideal channel reciprocity in TDD downlink base station (BS) cooperative transmission systems, which arises from imperfect antenna calibration among BSs. By exploiting the statistics of the ambiguity factors between uplink and downlink channels, a robust multiuser precoder is proposed, aimed at maximizing the lower...
A technology for aerial transmission of acoustic data which is robust against background noise in reverberant spaces is proposed. Hidden data are encoded as complex tones whose fundamental frequencies correspond to the chromatic scale. The decoding of hidden data is based on a pitch extraction algorithm that is employed in a CELP-based speech codec. Computer simulations revealed that the average bit...
This paper proposes to employ error statistics of nanoscale circuit fabrics to design robust energy-efficient digital signal processing (DSP) systems. Architectural-level error statistics are exploited to generate the probability, or reliability, of each output bit of a DSP kernel. The proposed technique is referred to here as bit-level a posteriori probability processing (BLAPP). Energy efficiency...
Speech quality is an important measure for performance evaluation in a wireless mobile communication system, since voice is still its most used service. This paper addresses speech quality evaluation in a GSM system employing narrowband and wideband AMR codecs with the Orthogonal Sub Channel (OSC) technique. OSC is a feature proposed in 3GPP GERAN to double circuit switched capacity in...
Existing image compression methods (e.g., JPEG2000) are vulnerable to bit loss, which is usually tackled by the channel coding that follows. However, source coding and channel coding have conflicting requirements. In this paper, we address the problem with an alternative paradigm and propose a novel compressive sensing (CS) based compression scheme. Discrete wavelet transform...
VoIP implementations are nowadays the preferred information technology alternative to public switched telephone networks. With dependence on this technology, VoIP quality and performance are critical. In this paper, we implement some commonly used VoIP codecs on Windows desktop operating systems to evaluate their performance on two versions of IP, namely IPv4 and IPv6. Performance-related metrics...
Distributed video coding (DVC) has recently been proposed to reduce the complexity of the encoder, but it suffers from the sampling cost of a huge amount of image data. To relax this sampling burden, this paper develops a novel sub-sampling distributed video coding (SuDVC) scheme that uses the compressive sensing (CS) technique. Due to the inherent sparsity in video sources, the video frames are compressively...