Our research group at Nagoya Institute of Technology is developing “MMDAgent” as a voice interaction toolkit. Using MMDAgent, system developers can create a variety of spoken dialogue content. When developers create voice interaction content, it is important to consider user needs. Therefore, an approach for eliciting the user's preference information is necessary. In this paper, we propose a method to...
In this study, we propose a method for determining the reference points in the time and frequency domains for voice morphing. Many studies have addressed voice morphing, but many of the proposed methods require the reference points to be determined manually. Here, we determine the reference points automatically using modified restricted temporal decomposition and line spectral frequencies. The evaluation...
Spoken dialog systems are now widely used. However, some users avoid them because of their poor usability and lack of appeal. In this study, we develop a system that captures the user's movement and estimates the user's state. This function is incorporated into an existing spoken dialog system to build a multimodal dialog system. In the experiments, the recognition performance of body motion...
Captioning lecture speech is very useful for improving comprehension. However, real-time manual captioning is costly, and the cost remains high even when an automatic speech recognition system is combined with human correction. In this paper, we propose a method to reduce the cost of human correction, as a prerequisite for a captioning framework based on an automatic speech recognition system. Specifically, we investigate...
This paper deals with sound event classification for automatic audio-based surveillance. To improve performance, we propose a feature vector combination scheme that uses multiple feature vectors simultaneously. The performance is then evaluated using a combination of three segment-based features. The results show a significant improvement compared to the conventional method.
Innovations in consumer electronics and media technologies are pervading daily life and entering the private home, and novel communication opportunities are emerging. Educational processes such as university studies, for example, are being extended in unconventional ways into the casual home office. Web browsers and networking technologies provide access to serious content and direct manipulation...