Non-negative matrix factorization (NMF) is a popular method for learning interpretable features from non-negative data, such as counts or magnitudes. Different cost functions are used with NMF in different applications. We develop an algorithm, based on the alternating direction method of multipliers, that tackles NMF problems whose cost function is a beta-divergence, a broad class of divergence functions...
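As an illustration of the cost family named in this abstract, the sketch below evaluates the beta-divergence between two positive arrays using its standard textbook definition. It is not the ADMM-based algorithm the abstract describes; the function name `beta_divergence` is hypothetical, and the inputs are assumed to be strictly positive.

```python
import numpy as np

def beta_divergence(x, y, beta):
    """Sum of element-wise beta-divergences d_beta(x | y) for positive arrays.

    Hypothetical helper for illustration only; assumes x > 0 and y > 0.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if beta == 1:
        # Limit case: generalized Kullback-Leibler divergence
        return float(np.sum(x * np.log(x / y) - x + y))
    if beta == 0:
        # Limit case: Itakura-Saito divergence
        return float(np.sum(x / y - np.log(x / y) - 1))
    # General case; beta = 2 gives half the squared Euclidean distance
    num = x**beta + (beta - 1) * y**beta - beta * x * y**(beta - 1)
    return float(np.sum(num / (beta * (beta - 1))))
```

Setting `beta` to 2, 1, or 0 recovers the Euclidean, Kullback-Leibler, and Itakura-Saito costs respectively, which is why a single algorithm for this family covers several common NMF variants.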
In this paper we propose a non-negative matrix factorization (NMF) model with piecewise-constant activation coefficients. This structure is enforced using a total variation penalty on the rows of the activation matrix. The resulting optimization problem is solved with a majorization-minimization procedure. The proposed algorithm is well suited to analyze data explained by underlying piecewise-constant...
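The total variation penalty mentioned in this abstract can be sketched as the sum of absolute differences between consecutive entries along each row of the activation matrix. This is a minimal illustration of the penalty term only, not the paper's majorization-minimization solver; the name `total_variation_rows` and the matrix layout (components in rows, time frames in columns) are assumptions.

```python
import numpy as np

def total_variation_rows(H):
    """Total variation of each row of H, summed over all rows.

    Hypothetical helper: assumes H has one activation sequence per row.
    """
    H = np.asarray(H, dtype=float)
    # Absolute differences between neighbouring columns, summed over all rows
    return float(np.abs(np.diff(H, axis=1)).sum())
```

A piecewise-constant row contributes only the sizes of its jumps, so penalizing this quantity favours activations that stay flat between a small number of change points.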
This paper seeks to exploit high-level temporal information during feature extraction from audio signals via non-negative matrix factorization. Contrary to existing approaches that impose local temporal constraints, we train powerful recurrent neural network models to capture long-term temporal dependencies and event co-occurrence in the data. This gives our method the ability to “fill in the blanks”...
Non-negative matrix factorization (NMF) has emerged as a promising approach for single-channel speech separation. In this paper, we propose a new method of discriminative learning of NMF. In contrast to conventional approaches where the basis vectors are learned independently on clean signals from each speaker, our approach optimizes all basis vectors jointly to reconstruct both clean signals and...
In this paper, we present a new method to perform underdetermined audio source separation using a spoken or sung reference signal to inform the separation process. This method explicitly models possible differences between the spoken reference and the target signal, such as pitch differences and time lag. We show that the proposed algorithm outperforms state-of-the-art methods.
Note onset detection and instrument recognition are two of the most investigated tasks in Music Information Retrieval (MIR). Various detection methods have been proposed in previous research for western music, with less focus on other music cultures of the world. In this paper, we focus on onset detection for percussion instruments in Beijing Opera, a major genre of Chinese traditional music. A dataset...
This paper examines complex non-negative matrix factorization (CMF) as a tool for separating overlapping partials in mixtures of harmonic musical sources. Unlike non-negative matrix factorization (NMF), CMF allows for the development of source separation procedures founded on a mixture model rooted in the complex-spectrum domain (in which the superposition of overlapping sources is preserved). This...
This paper studies multichannel audio separation using non-negative matrix factorization (NMF) combined with a new model for spatial covariance matrices (SCM). The proposed model for SCMs is parameterized by source direction of arrival (DoA) and its parameters can be optimized to yield a spatially coherent solution over frequencies thus avoiding permutation ambiguity and spatial aliasing. The model...
This paper presents a multimodal voice conversion (VC) method for noisy environments. In our previous NMF-based VC method, source exemplars and target exemplars are extracted from parallel training data, in which the same texts are uttered by the source and target speakers. The input source signal is then decomposed into source exemplars, noise exemplars obtained from the input signal, and their weights...
Model-based speech enhancement methods, which rely on separately modeling the speech and the noise, have been shown to be powerful in many different problem settings. When the structure of the noise can be arbitrary, which is often the case in practice, model-based methods have to focus on developing good speech models, whose quality will be key to their performance. In this study, we propose a novel...
Speech enhancement based on statistical models performs well, but its performance degrades when the environmental noise is highly non-stationary, because these models assume stationary noise. In contrast, template-based enhancement methods are more robust to non-stationary noise but depend heavily on a priori information present in the training data. In order to overcome both shortcomings,...
We propose a new algorithm to efficiently obtain non-negative sparse representations for audio. The spectrum of an audio signal is represented as a sparse linear combination of atoms taken from an overcomplete dictionary. The algorithm is based on minimizing the generalized Kullback-Leibler divergence between an observed magnitude spectrum and a non-negative linear combination of atoms, plus an ℓ1...
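To make the objective in this abstract concrete, the sketch below minimizes the generalized Kullback-Leibler divergence plus an ℓ1 penalty over the activations of a fixed dictionary, using the well-known multiplicative update rule. This is a common baseline, not the new algorithm the abstract proposes; the names `sparse_kl_activations`, `objective`, `lam`, and `eps` are hypothetical, and the dictionary `W` is assumed to hold non-negative atoms in its columns.

```python
import numpy as np

def sparse_kl_activations(v, W, lam=0.1, n_iter=200, eps=1e-12, seed=0):
    """Estimate non-negative activations h such that v is approximately W @ h.

    Baseline multiplicative updates for KL divergence + lam * sum(h).
    """
    rng = np.random.default_rng(seed)
    h = rng.random(W.shape[1]) + eps      # strictly positive starting point
    col_sums = W.sum(axis=0)              # W^T 1, constant across iterations
    for _ in range(n_iter):
        ratio = v / (W @ h + eps)         # element-wise v / (W h)
        h *= (W.T @ ratio) / (col_sums + lam)
    return h

def objective(v, W, h, lam, eps=1e-12):
    """Generalized KL divergence between v and W @ h, plus the l1 penalty."""
    v_hat = W @ h + eps
    kl = np.sum(v * np.log((v + eps) / v_hat) - v + v_hat)
    return float(kl + lam * h.sum())
```

Because every factor in the update is non-negative, the activations stay non-negative without an explicit projection; larger values of `lam` shrink the activations more strongly and so encourage sparser solutions.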
Exemplar-based approaches, which model signals as sparse linear combinations of exemplars, have been shown to achieve state-of-the-art performance in noise-robust ASR, especially at low SNRs. However, since both the speech exemplars and noise exemplars are built from training data and are fixed throughout the process of enhancing speech features, the conventional approach is especially weak...