We consider the task of multi-view subspace learning which integrates multi-view information to learn a unified representation for multimedia data. In real-world scenarios, we encounter views with high diversities of semantic levels. Neglecting the problem of semantic inconsistency, existing graph-based methods directly convert heterogeneous information into local affinity matrices to conduct a fusion...
This paper compares the use of signal-to-noise ratio (SNR)-dependent and SNR-independent mixtures of probabilistic linear discriminant analysis (PLDA) versus conventional PLDA, under multi-noise and multi-SNR conditions for a small-set speaker verification system. Results indicate that conventional PLDA is more robust under multi-SNR conditions. The effect of the testing speech length is also examined...
It is commonly accepted that one of the most important factors in assuring the high performance of an electrical network is surveillance and the corresponding preventive maintenance. TSOs and DSOs have long incorporated grid surveillance, including the inspection of aerial power lines, in their maintenance plans. Those inspections started with human patrols, including...
Recordings of read-aloud stories by children in a school setting can be used to provide an assessment of reading skills via automatic speech recognition (ASR). ASR, however, is known to be highly susceptible to background noise. The unusual variety of foreground (breath release, mic pops, etc.) and background (children playing, distinct background talker, wind, etc.) non-speech sounds makes this application...
One of the major reasons for the performance degradation of a speaker verification (SV) system in real-world conditions is its inability to spot speech regions due to the presence of noise. This work focuses on the role of voice activity detection (VAD) methods in alleviating such shortcomings. The experiments are conducted on the core-core task of the speakers in the wild (SITW) challenge. Two VAD...
Accurate and cost-effective localization is an important requirement for several sensor network applications. In this paper, we compare two localization methods, multilateration and Isomap, and study some of the key issues that affect their performance. We also analyze the flip ambiguity problem and the effect of applying a robustness criterion in both methods. Our simulation results show that the...
Fault detection of nonlinear systems becomes more feasible when it is conducted over Takagi-Sugeno (TS) approximated fuzzy models. A proportional-plus-integral observer (PIO) and a robust observer (RO) have already been developed for the estimation of the system states and actuator/sensor faults. In this paper, the algorithms are implemented for the detection of valve and level sensor faults of a two-tank...
In this paper, we propose a new method named local full-directional pattern (LFDP) for content-based image retrieval (CBIR). In addition, instead of applying the algorithm to the image itself, we apply it to a new image constructed by taking the mean gray value of each 3×3 sub-region as each pixel's value. In local binary pattern (LBP) the gray value difference of the central pixel and its neighboring pixels...
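The two ingredients named in this abstract, 3×3 block averaging and the basic LBP comparison of a central pixel with its neighbors, can be illustrated with a minimal sketch. This is plain LBP, not the LFDP descriptor itself, which the truncated abstract does not specify. The use of non-overlapping blocks, the clockwise bit order, the `>=` threshold, and all function names are assumptions made for illustration.

```python
def block_mean_image(img, k=3):
    """Replace each non-overlapping k x k block by its mean gray value.

    Assumption: blocks do not overlap; leftover border rows/columns are dropped.
    """
    rows, cols = len(img) // k, len(img[0]) // k
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            total = sum(img[r * k + i][c * k + j]
                        for i in range(k) for j in range(k))
            row.append(total / (k * k))
        out.append(row)
    return out


# The 8 neighbors, clockwise from the top-left (bit order is an assumption).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """Basic LBP: set a bit for each neighbor whose value is >= the center."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

In a retrieval setting, a histogram like this would typically be computed on the block-averaged image and compared across images with a distance measure; which measure the paper uses is not stated in the abstract.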
We study the problem of low-rank and sparse decomposition from possibly noisy observations. We propose a novel objective function with nuclear norm on the low-rank term and ℓ0-‘norm’ on the sparse term, as well as ℓ1-norm on the additive noise term. When there is no dense inlier noise, the proposed method shares the same theoretical guarantee as the Principal Component Pursuit (PCP), i.e., it can...
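As a rough sketch of the kind of objective this abstract describes, write the observations as M, the low-rank term as L, the sparse term as S, and the additive noise term as N. The weights λ and μ and the equality-constrained form below are assumptions; the truncated abstract does not give the exact formulation.

```latex
\min_{L,\,S,\,N}\; \|L\|_{*} + \lambda \|S\|_{0} + \mu \|N\|_{1}
\quad \text{subject to} \quad M = L + S + N
```

For comparison, PCP in its standard form minimizes \|L\|_{*} + \lambda \|S\|_{1} subject to M = L + S, that is, it uses the ℓ1-norm on the sparse term and has no separate noise term.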
State filtering is a key problem in many signal processing applications. From a series of noisy measurements, one would like to estimate the state of some dynamic system. Existing techniques usually adopt a Gaussian noise assumption, which may result in a major degradation in performance when the measurements contain outliers. A robust algorithm immune to the presence of outliers is...
Universal compressed sensing algorithms recover a “structured” signal from its under-sampled linear measurements, without knowing its distribution. The recently developed minimum entropy pursuit (MEP) optimization suggests a framework for developing universal compressed sensing algorithms. In the noiseless setting, among all signals that satisfy the measurement constraints, MEP seeks the “simplest”...
This paper presents a novel noise suppression method to enhance soft speech recorded with a special body-conductive microphone called a nonaudible murmur (NAM) microphone. The NAM microphone is capable of detecting extremely soft speech, but the recorded soft speech easily suffers from external noise due to its faint volume. To effectively suppress noise on the body-conducted signals, an external noise...
It often happens that we are interested in reconstructing an unknown signal from partial measurements. Also, it is typically assumed that the location (temporal or spatial) of each sample is known and that the only distortion present in the observations is due to additive measurement noise. However, there are some applications where such location information is lost. In this paper, we consider the...
With the explosion in the availability of user-generated videos documenting conflicts and human rights abuses around the world, analysts and researchers increasingly find themselves overwhelmed with massive amounts of video data from which to acquire and analyze useful information. In this paper, we develop a temporal localization framework for intense audio events in videos which addresses the problem....
In the era of social media, a large number of user-generated videos are uploaded to the Internet every day, capturing events all over the world. Reconstructing the event truth based on information mined from these videos has become an emerging and challenging task. Temporal alignment of videos “in the wild” which capture different moments at different positions with different perspectives is the critical...
Beamforming algorithms in binaural hearing aids are crucial to improve speech understanding in background noise for hearing-impaired persons. In this study, we compare and evaluate the performance of two recently proposed minimum variance (MV) beamforming approaches for binaural hearing aids. The binaural linearly constrained MV (BLCMV) beamformer applies linear constraints to maintain the target...
A linear-time algorithm termed SPARse Truncated Amplitude flow (SPARTA) is developed for the phase retrieval (PR) of sparse signals. Upon formulating the sparse PR as a non-convex empirical loss minimization task, SPARTA emerges as an iterative solver consisting of two components: s1) a sparse orthogonality-promoting initialization leveraging support recovery and principal component analysis; and,...
We study the Nonnegative Matrix Factorization problem which approximates a nonnegative matrix by a low-rank factorization. This problem is particularly important in Machine Learning, and finds itself in a large number of applications. Unfortunately, the original formulation is ill-posed and NP-hard. In this paper, we propose a row sparse model based on Row Entropy Minimization to solve the NMF problem...
Since the introduction of deep neural network (DNN)-based acoustic models, robust automatic speech recognition using DNNs has been an active research topic. In model adaptation especially, the use of auxiliary context features is known to be a promising technique. Recently, we proposed a technique called two-stage noise-aware training (TS-NAT). The key idea of TS-NAT is to let the DNN clarify...
We present the performance of three popular image feature extraction methods: Scale-Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF) and Histogram of Oriented Gradients (HOG). Specifically, we compare the performance of feature detection methods for images corrupted with different types of noise. The efficiency of the three methods is measured by considering the number of...