This paper shows that pattern classification based on machine learning is a powerful tool for analyzing human brain activity data obtained by magnetoencephalography (MEG). We propose a new weighting method using a multiple kernel learning (MKL) algorithm to localize the brain areas that contribute to accurate vowel discrimination. Our MKL simultaneously estimates both the classification boundary and...
This paper shows that pattern classification based on machine learning is a powerful tool for analyzing human brain activity data obtained by magnetoencephalography (MEG). In our previous work, a weighting method using multiple kernel learning was proposed, but this method had a high computational cost. In this paper, we propose a novel and fast weighting method using an AdaBoost algorithm to find...
In a real environment, acoustic and language features often vary depending on the speakers, speaking styles and topic changes. This paper focuses on changes in the language environment, and applies a topic tracking model to language model adaptation for speech recognition and topic word extraction for meeting analysis. The topic tracking model can adaptively track changes in topics based on current...
We investigated speech recognition for a person with articulation disorders resulting from athetoid cerebral palsy. The articulation of speech tends to become unstable due to strain on speech-related muscles, which degrades speech recognition performance. Therefore, we use multiple acoustic frames (MAF) as an acoustic feature to solve this problem. Further, in a real environment, current speech...
A large amount of signal processing research on music-related applications is being carried out. Sound synthesis, in particular, is one of the most interesting research themes. In this paper, we propose a new approach to mathematically modeling harmonic-timbre structure with a multi-beta-distribution (MBD). This probabilistic distribution has the advantage of enabling one to easily...
This paper presents a sound source (talker) localization method using only a single microphone based upon maximum likelihood. In our previous work, we proposed GMM (Gaussian mixture model) separation for estimation of the sound source direction, where the observed (reverberant) speech is separated into the acoustic transfer function and the clean speech GMM, and showed its effectiveness for the single-talker...
Most recent facial expression recognition systems work well only with frontal face images. However, subjects do not always face forward. With this in mind, we propose a method for pose-robust facial expression recognition. Active appearance models (AAMs) are used for face tracking to extract pose-robust facial feature points. However, AAM has accuracy problems with face tracking when...
In conventional methods for region segmentation of objects, the best segmentation results have been obtained by semi-automatic or interactive methods that require a small amount of user input. In this study, we propose a new technique for automatically obtaining segmentation of a flower region by using visual attention (saliency maps) as the prior probability in graph cuts. First, AdaBoost determines...
This paper introduces an active microphone concept that achieves a good combination of active operation and signal processing, and proposes a new sound-source-direction estimation method using only a single microphone with a parabolic reflection board. A simple signal-power-based method using a parabolic antenna has been proposed in the radar field, but the signal-power-based method is not effective...
In this paper, we propose a method to estimate 3D human posture from a monocular image without using markers. A 3D human body is expressed by a multi-joint model, and a set of joint angles describes a posture. The proposed method estimates the posture using histograms of oriented gradients (HOG) feature vectors that can express the shape of the object in the input image obtained from monocular...
In this paper, we propose a method of object recognition and segmentation using the scale-invariant feature transform (SIFT) and graph cuts. SIFT features are invariant to rotation, scale changes, and illumination changes, and are often used for object recognition. However, in previous object recognition work using SIFT, the object region is simply presumed by the affine-transformation and the accurate...
In this paper, we propose an imitation learning framework to generate multiple behaviors with balance control by recognizing human behaviors while estimating the ground reaction force. In our proposed method, a part of captured human motion data is recognized as one particular behavior that is represented by a linear dynamical model. Therefore, our method has little dependence on a classification criterion...
This paper proposes an approach to image segmentation using iterated graph cuts based on local texture features of wavelet coefficients. Using multiresolution analysis based on the Haar wavelet, the low-frequency range (smoothed image) is used for the n-links and the high-frequency range (local texture features) is used for the t-links along with a color histogram. The proposed method can segment the object region with noisy...
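The split into a low-frequency (smoothed) band and high-frequency (texture) bands mentioned above can be illustrated with one level of a 2D Haar decomposition. This is a minimal sketch, not the authors' implementation: the function names (haar_dwt2, texture_energy), the orthonormal scaling, and the way the three detail bands are combined into a single texture measure are all assumptions made for illustration. How the bands actually enter the n-link and t-link costs is not specified in the abstract and is not modeled here.

```python
import numpy as np

def haar_dwt2(image):
    """One level of a 2D Haar transform with orthonormal scaling.

    Returns (approx, row_detail, col_detail, diag_detail), each half the
    size of the input. Hypothetical helper for illustration only.
    """
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    # The four pixels of every non-overlapping 2x2 block.
    a = img[0::2, 0::2]  # top-left
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0       # low-frequency band (smoothed image)
    row_detail = (a + b - c - d) / 2.0   # change between the two rows
    col_detail = (a - b + c - d) / 2.0   # change between the two columns
    diag_detail = (a - b - c + d) / 2.0  # diagonal change
    return approx, row_detail, col_detail, diag_detail

def texture_energy(row_detail, col_detail, diag_detail):
    """Combine the detail bands into one local texture magnitude (an assumption)."""
    return np.sqrt(row_detail**2 + col_detail**2 + diag_detail**2)

if __name__ == "__main__":
    img = np.arange(16, dtype=float).reshape(4, 4)
    approx, rd, cd, dd = haar_dwt2(img)
    print(approx.shape, texture_energy(rd, cd, dd).shape)
```

With this scaling the transform is orthogonal, so the total squared energy of the four bands equals that of the input image, and a perfectly flat image yields zero texture energy.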
In this paper, we propose a method of digital zooming that automatically recognizes soccer game events such as penalty kicks and free kicks based on player and ball tracking. We also propose an efficient and stable ball tracking method that switches between global search and local search. In the frames where the ball is lost as well as in the first frame, the global search with normalized...
In our previous work, the use of PCA instead of DCT showed robustness in distorted speech recognition because the main speech element is projected onto low-order features, while the noise or distortion element is projected onto high-order features [1]. This paper introduces a new feature extraction technique that collects the correlation information among phoneme subspaces and their elements are statistically...
This paper introduces the concept of an active microphone that achieves a good combination of active operation and signal processing, and proposes a new sound-source-direction estimation method using only a single microphone with a parabolic reflection board. In our previous work [1], we proposed GMM (Gaussian Mixture Model) separation for estimation of the sound source direction, where the observed...
Audio is a key index in digital videos that can provide useful information for video editing, such as capturing only conversations or clipping only people who are talking. In this paper, we study audio-based video editing with a two-channel (stereo) microphone, which is standard equipment on video cameras, where the video content is automatically recorded without a cameraman. In order...
For a mobile robot to serve people in actual environments, such as a living room or a party room, it must be easy to control because some users might not even be capable of operating a computer keyboard. For nonexpert users, speech recognition is one of the most effective communication tools when it comes to a hands-free (human-robot) interface. This paper describes a new mobile robot with hands-free...
We have already proposed a new feature extraction method based on higher-order local auto-correlation and Fisher weight map (FWM) at Interspeech 2006. This paper shows the effectiveness of the proposed FWM in speaker-dependent and speaker-independent phoneme recognition. Widely used MFCC (Mel-frequency cepstrum coefficient) features lack temporal dynamics. To solve this problem, local auto-correlation...
In this paper, we propose an online, training-oriented video shooting navigation system that focuses on camerawork based on video grammar and uses real-time camerawork evaluation to train users to shoot good shots for later editing. In this system, the processing speed must be very high, so we use a luminance projection correlation and a structure tensor method to extract the camerawork parameters...