Clustering for better representation of the diversity of text or image search results has been studied extensively. In this paper, we extend this methodology to the novel domain of music search. We conduct an empirical evaluation of different clustering algorithms, audio feature representations, and the incorporation of lyrics for music clustering. Our evaluation shows the fusion of audio and text features...
The main objective of this paper is to explore the effectiveness of feature selection for performing composite speaker identification/verification. We propose features such as line spectral frequency (LSF), differential line spectral frequency (DLSF), mel frequency cepstral coefficients (MFCC), discrete cosine transform cepstrum (DCTC), perceptual linear predictive cepstrum (PLP) and mel frequency...
The main objective of this paper is to explore the effectiveness of features for identifying speakers. We propose features such as line spectral frequency (LSF), differential line spectral frequency (DLSF), mel frequency cepstral coefficients (MFCC), discrete cosine transform cepstrum (DCTC), perceptual linear predictive cepstrum (PLP) and mel frequency perceptual linear predictive cepstrum (MF-PLP)...
We investigate the symmetric Kullback-Leibler (KL2) distance in speaker clustering and its unreported effects for differently sized feature matrices. Speaker data is represented as Mel frequency cepstral coefficient (MFCC) vectors, and features are compared using the KL2 metric to form clusters of speech segments for each speaker. We make two observations with respect to clustering based on KL2: 1...
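As a rough illustration of the KL2 comparison described above, the sketch below computes a symmetric Kullback-Leibler distance between two MFCC feature matrices. It assumes each speech segment is summarized by a single diagonal-covariance Gaussian, which the abstract does not state; the function name `kl2_diag_gaussian` and the `eps` variance floor are hypothetical choices made for this example, not details taken from the paper.

```python
import numpy as np


def kl2_diag_gaussian(x, y, eps=1e-8):
    """Symmetric KL (KL2) distance between two feature matrices.

    x, y: arrays of shape (n_frames, n_dims), e.g. MFCC vectors per frame.
    Assumption: each matrix is modeled as one diagonal-covariance Gaussian.
    """
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    # Small floor keeps the variance ratios finite for near-constant dimensions.
    var_x = x.var(axis=0) + eps
    var_y = y.var(axis=0) + eps
    d2 = (mu_x - mu_y) ** 2
    # KL(p||q) + KL(q||p) for diagonal Gaussians; the log-variance terms cancel.
    per_dim = 0.5 * (var_x / var_y + var_y / var_x - 2.0) \
        + 0.5 * d2 * (1.0 / var_x + 1.0 / var_y)
    return float(per_dim.sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seg_a = rng.normal(0.0, 1.0, size=(200, 13))  # synthetic stand-in for a long segment
    seg_b = rng.normal(1.0, 2.0, size=(50, 13))   # synthetic stand-in for a short segment
    print(kl2_diag_gaussian(seg_a, seg_b))
```

The two inputs may have different numbers of frames, since only their per-dimension means and variances enter the distance. Short segments give noisier estimates of those statistics, so the matrix size can still influence the resulting value.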