This paper presents our newly developed real-time meeting analyzer for monitoring conversations in an ongoing group meeting. The goal of the system is to automatically recognize “who is speaking what” in an online manner for meeting assistance. Our system continuously captures the utterances and the face pose of each speaker using a distant microphone array and an omni-directional camera at the center...
We present our newly developed real-time conversation analyzer for group meetings. The goal of the system is to estimate automatically “who speaks when and what” in an online manner. In our system, “who speaks when” information is first obtained by estimating the directions of arrival (DOAs) of signals. Then, “who speaks what” is estimated with our automatic speech recognition (ASR) system, after...
This paper addresses the problem of voice activity detection (VAD) in noisy environments. The proposed VAD method integrates multiple speech features and a signal decision scheme, namely the speech periodic-to-aperiodic component ratio and a switching Kalman filter. The integration is carried out by using the weighted sum of the likelihoods output by each VAD (stream). The stream weight...
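The weighted-sum integration described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of per-frame log-likelihood ratios as stream scores, the normalization of the weights, and the zero decision threshold are all assumptions made for illustration. The abstract is truncated before it explains how the stream weights are obtained, so they are simply passed in here.

```python
def fuse_stream_scores(scores, weights):
    """Combine per-stream speech scores for one frame by a weighted sum.

    scores  -- one score per VAD stream (assumed here to be log-likelihood
               ratios, speech vs. non-speech; this is an assumption)
    weights -- one non-negative weight per stream (hypothetical values)
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalize so the weights sum to 1, then take the weighted sum.
    return sum(w * s for w, s in zip(weights, scores)) / total


def is_speech(scores, weights, threshold=0.0):
    """Declare speech when the fused score exceeds the threshold (assumed 0)."""
    return fuse_stream_scores(scores, weights) > threshold
```

With two streams scoring 2.0 and -1.0, weights of 0.75 and 0.25 favor the first stream and the frame is labeled speech; swapping the weights flips the decision.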
This paper addresses a speech recognition problem in non-stationary noise environments: the estimation of noise sequences. To solve this problem, we present a particle filter-based sequential noise estimation method for the front-end processing of speech recognition. In the proposed method, the particle filter is defined by a dynamical system based on Polyak averaging and feedback. We also introduce...
In this paper, a personal digital assistant (PDA) for hands-free speech recognition and communication with a microphone array mounted on the PDA is presented. An outlier-robust generalized sidelobe canceller (RGSC) and a minimum mean-squared error (MMSE) estimator for log Mel-spectral energy coefficients using a Gaussian mixture model (GMM) for clean speech are implemented in real-time and evaluated...
This paper addresses a speech recognition problem in non-stationary noise environments: the estimation of noise sequences. To solve this problem, we present a particle filter-based sequential noise estimation method for front-end processing of speech recognition in noise. In the proposed method, a noise sequence is estimated in three stages: a sequential importance sampling step, a residual resampling...
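Residual resampling, one of the stages named in this abstract, is a standard particle filter step and can be sketched generically as follows. This is the textbook form of the technique, not the paper's specific noise-estimation code; the function name and the optional `rng` parameter are hypothetical. Each particle is first copied a deterministic number of times, floor(N * w_i), and the remaining slots are filled by random draws proportional to the leftover fractional weights.

```python
import math
import random


def residual_resample(weights, rng=None):
    """Return the indices of the particles kept by residual resampling.

    weights -- non-negative importance weights (normalized internally)
    rng     -- optional random.Random instance, for reproducible draws
    """
    rng = rng or random.Random()
    n = len(weights)
    total = sum(weights)
    if n == 0 or total <= 0:
        raise ValueError("need at least one particle with positive weight")

    # Expected number of copies of each particle.
    scaled = [n * w / total for w in weights]
    # Deterministic part: keep floor(N * w_i) copies of particle i.
    counts = [math.floor(s) for s in scaled]
    indices = [i for i, c in enumerate(counts) for _ in range(c)]

    # Random part: fill the remaining slots from the fractional residuals.
    remaining = n - len(indices)
    if remaining > 0:
        residuals = [s - c for s, c in zip(scaled, counts)]
        indices.extend(rng.choices(range(n), weights=residuals, k=remaining))
    return indices
```

Compared with plain multinomial resampling, the deterministic copies reduce the variance introduced by the resampling step, since only the fractional remainder is left to chance.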