This paper describes a general framework for speaker recognition under the summed-channel condition for both enrollment and test data. We present several methods for clustering the target speaker who appears in multiple summed-channel enrollment excerpts. In our approach, each excerpt is first segmented separately by a speaker diarization system. Then segments belonging to the same speaker...
In mispronunciation detection, cross-speaker degradation and other confounding nuisance factors are challenging problems that need to be addressed. In this paper, we attempt to remove non-pronunciation variation in the GLDS-SVM expansion space by using a nuisance attribute projection strategy, in order to increase the separability between different phoneme instances...
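The projection step named in the abstract above can be illustrated with a minimal sketch. It assumes the standard form of nuisance attribute projection, P = I - U U^T, where the columns of U span the nuisance subspace. The function name `nap_projection` and the toy 4-dimensional data are hypothetical and are not taken from the paper; in practice U is commonly estimated from the leading eigenvectors of a within-class scatter matrix, which is not shown here.

```python
import numpy as np

def nap_projection(nuisance_basis):
    """Return P = I - U U^T for the subspace spanned by the given columns."""
    # Orthonormalise the basis so that P is a true orthogonal projection.
    u, _ = np.linalg.qr(nuisance_basis)
    return np.eye(u.shape[0]) - u @ u.T

# Hypothetical toy data: 4-dimensional expansion vectors, one nuisance direction.
rng = np.random.default_rng(0)
nuisance = np.array([[1.0], [1.0], [0.0], [0.0]])
P = nap_projection(nuisance)

x = rng.normal(size=4)   # stand-in for one expanded feature vector
x_clean = P @ x          # the component along the nuisance direction is removed
```

After projection, `x_clean` is orthogonal to the nuisance direction, so a classifier trained on projected vectors cannot separate instances along it.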
Joint factor analysis (JFA) has become the state-of-the-art technique for speaker verification. However, training the eigenvoice matrix is a heavy burden, because it requires large amounts of multi-channel data, and this data largely determines the performance of the system. In this paper, we first try to explore an upper bound on the performance of the JFA system in a non-normal...
In this paper, a multi-speaker identification system for co-channel speech is proposed. By using a constrained likelihood and a floating TMR method, this system can identify two speakers in co-channel speech with high accuracy.
Monaural speech separation is one of the most difficult problems in speech signal processing. In this paper, a new method based on machine learning and computational auditory scene analysis (CASA) is proposed to separate monaural two-talker speech. Machine learning is used to learn the grouping cues from isolated clean data of a single speaker. By using a factorial-max vector...
In this paper, we propose a two-level segmentation method that effectively detects speaker changes in a continuous audio stream. In our approach, we divide the change detection process into two levels: a region level that detects the potential change regions containing candidate speaker change points, and a boundary level that searches for and refines the true change points. At the region level, we employ...