Generating a visual codebook requires a quantization step. Several works have demonstrated the efficiency of sparse coding for feature quantization in BoW-based image representation. Sparse coding is an important method that encodes the original signal in a sparse signal space. Yet, this method neglects the relationships among features. To reduce the impact of this issue, we...
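The quantization step described above can be sketched as follows: a toy k-means codebook is learned from local features, and each feature is then hard-assigned to its nearest codeword to form a BoW histogram. This is a minimal illustration under assumed synthetic data, not the method of the abstract (which replaces hard assignment with sparse coding).

```python
import numpy as np

def build_codebook(features, k, iters=10, seed=0):
    """Toy k-means: learn a codebook of k codewords from (N, d) features."""
    rng = np.random.default_rng(seed)
    codebook = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        # Assign each feature to its nearest codeword (Euclidean distance).
        dists = np.linalg.norm(features[:, None] - codebook[None], axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                codebook[j] = features[assign == j].mean(axis=0)
    return codebook

def bow_histogram(features, codebook):
    """Hard-assign each feature to exactly one codeword and count."""
    dists = np.linalg.norm(features[:, None] - codebook[None], axis=2)
    assign = dists.argmin(axis=1)
    hist = np.bincount(assign, minlength=len(codebook)).astype(float)
    return hist / hist.sum()  # normalized BoW representation

# Synthetic local descriptors standing in for real image features.
rng = np.random.default_rng(1)
feats = rng.normal(size=(200, 8))
cb = build_codebook(feats, k=16)
h = bow_histogram(feats, cb)
```

The hard `argmin` assignment is exactly the "each feature to only one visual word" behavior that sparse-coding-based quantization is designed to soften.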
The standard sparse representation framework decouples the problem into two subproblems, i.e., alternating sparse coding and dictionary learning with different optimizers, treating the elements of the bases and the codes separately. In this paper, we treat the elements of both bases and codes homogeneously. The original optimization is decoupled directly into several blockwise alternating subproblems rather than the two above. Hence,...
Bagging is one of the most classic ensemble learning techniques in the machine learning literature. The idea is to generate multiple subsets of the training data via bootstrapping (random sampling with replacement), and then aggregate the outputs of the models trained on each subset via voting or averaging. As music is a temporal signal, we propose and study two bagging methods in this paper: the inter-song...
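The generic bagging procedure the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's inter-song variant: `fit_linear` and the synthetic data are assumptions, and aggregation is done by averaging.

```python
import numpy as np

def bagging_predict(train_X, train_y, test_X, fit_fn, n_models=10, seed=0):
    """Train n_models base learners on bootstrap resamples of the
    training set and aggregate their predictions by averaging."""
    rng = np.random.default_rng(seed)
    n = len(train_X)
    preds = []
    for _ in range(n_models):
        # Bootstrap: sample n indices with replacement.
        idx = rng.choice(n, size=n, replace=True)
        model = fit_fn(train_X[idx], train_y[idx])
        preds.append(model(test_X))
    return np.mean(preds, axis=0)

def fit_linear(X, y):
    """Simple least-squares base learner (an assumed stand-in)."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return lambda Xt: Xt @ w

# Synthetic regression data.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)
pred = bagging_predict(X, y, X[:5], fit_linear, n_models=20)
```

For classification the `np.mean` aggregation would be replaced by majority voting over the base learners' predicted labels.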
To detect violence in a video, a common video description method is to apply local spatio-temporal descriptors to the query video. The low-level description is then summarized into a high-level feature based on the Bag-of-Words (BoW) model. However, traditional spatio-temporal descriptors are not discriminative enough. Moreover, the BoW model roughly assigns each feature vector to only one visual...
This paper presents a novel feature representation called sparse cepstral codes for instrument identification. We first motivate the approach by discussing why cepstrum is suitable for instrument identification. Then we propose the use of sparse coding and power normalization to derive compact codes that better represent the information of the cepstrum. Our evaluation on both uni-source and multi-source...
In this work we focus on the problem of estimating time-varying sparse signals from a sequence of under-sampled observations. We formulate this problem as estimating hidden states in a dynamic model and exploit the underlying temporal structure to find a more accurate solution, particularly when the information in the observations is scarce. We propose an optimization procedure based on smoothing...
When applying sparse representation techniques to images, the standard approach is to independently compute the representations for a set of overlapping image patches. This method performs very well in a variety of applications, but the independent sparse coding of each patch results in a representation that is not optimal for the image as a whole. A recent development is convolutional sparse coding,...
Analysis sparsity and the accompanying analysis operator learning problem provide an important framework for signal modeling. Very recently, sparsifying transform learning has been put forward as an effective and new formulation for the analysis operator learning problem. In this study, we develop a new sparsifying transform learning algorithm by using the uniform normalized tight frame constraint...
Representing music information using audio codewords has led to state-of-the-art performance on various music classification benchmarks. Compared with conventional audio descriptors, audio words offer greater flexibility in capturing the nuance of music signals, in that each codeword can be viewed as a quantization of the music universe and that the quantization becomes finer as the size of the dictionary...
We propose a new algorithm to efficiently obtain non-negative sparse representations for audio. The spectrum of an audio signal is represented as a sparse linear combination of atoms taken from an overcomplete dictionary. The algorithm is based on minimizing the generalized Kullback-Leibler divergence between an observed magnitude spectrum and a non-negative linear combination of atoms, plus an ℓ1...
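The objective described here, a generalized KL divergence between an observed magnitude spectrum and a non-negative combination of dictionary atoms plus an ℓ1 penalty, can be minimized with standard Lee-Seung-style multiplicative updates. The sketch below is one common way to solve this objective, not necessarily the paper's algorithm, and the dictionary and spectrum are synthetic assumptions.

```python
import numpy as np

def kl_sparse_code(v, W, lam=0.01, iters=500, eps=1e-9):
    """Non-negative activations h minimizing the generalized KL
    divergence D(v || W h) + lam * ||h||_1 via multiplicative updates.
    v: (m,) magnitude spectrum; W: (m, k) non-negative dictionary."""
    h = np.full(W.shape[1], 1.0)
    ones = np.ones_like(v)
    for _ in range(iters):
        wh = W @ h + eps
        # Multiplicative update; the lam term in the denominator
        # implements the l1 sparsity penalty.
        h *= (W.T @ (v / wh)) / (W.T @ ones + lam + eps)
    return h

# Synthetic overcomplete-ish dictionary and a spectrum built from 3 atoms.
rng = np.random.default_rng(0)
W = rng.random((64, 20))              # 20 non-negative spectral atoms
h_true = np.zeros(20)
h_true[[2, 7, 11]] = [1.0, 0.5, 2.0]
v = W @ h_true                        # synthetic magnitude spectrum
h = kl_sparse_code(v, W)
```

The multiplicative form guarantees that `h` stays non-negative as long as it is initialized non-negative, which is why this update style is standard for KL-based non-negative models.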
Sparse dictionary learning has attracted enormous interest in image processing and data representation in recent years. To improve the performance of dictionary learning, we propose an efficient block-structured incoherent K-SVD algorithm for the sparse representation of signals. Without relying on any prior knowledge of the group structure for the input data, we develop a two-stage agglomerative...