A robot working with humans or other robots is expected to adapt to changes in the environment. Reinforcement learning has been well studied for motor skill learning, robot behavior acquisition, and adaptation of behavior to environmental changes. However, it is not practical for a robot to learn and adapt its behavior from scratch through trial and error alone because huge...
Estimation of a caregiver's view is one of the most important capabilities for a child to understand the behavior demonstrated by the caregiver, that is, to infer the intention of behavior and/or to learn the observed behavior efficiently. We hypothesize that the child develops this ability in the same way as behavior learning motivated by an intrinsic reward, that is, he/she updates the model of...
Improving the quality of nursing care is very important in our lives. Currently, free-form nursing-care texts (nursing-care data) are collected from many hospitals in Japan using Web applications. The collected nursing-care data are stored in a database. To evaluate nursing-care data, we have already proposed a fuzzy classification system, a neural network based system, a support vector machine...
Both a self-learning architecture (embedded structure) and explicit/implicit teaching from other agents (an environmental design issue) are necessary, not only for learning a single behavior but, more importantly, for life-time behavior learning. This paper presents a method for a robot to understand unfamiliar behavior shown by others through the collaboration between behavior acquisition and recognition of observed...
Life-time development of behavior learning seems to be based not only on a self-learning architecture but also on explicit/implicit teaching from other agents, which is expected to accelerate the learning. This paper presents a method for a robot to understand unfamiliar behaviors shown by others through the collaboration between behavior acquisition and recognition of observed behaviors, where the state value...
Existing reinforcement learning approaches suffer from the curse of dimensionality when applied to multiagent dynamic environments. A typical example is the RoboCup competitions, where other agents and their behaviors easily cause an explosion of the state and action spaces. This paper presents a method of modular learning in a multiagent environment by which the learning...
Reinforcement learning is a promising approach to realizing intelligent agents such as autonomous mobile robots. To apply reinforcement learning to problems of realistic size, the "curse of dimensionality" in the partitioning of sensory states must be avoided while maintaining computational efficiency. The paper describes a hierarchical modular reinforcement learning that Profit Sharing...
We propose a novel approach for the acquisition and development of behaviors through observation in a multi-agent environment. Observed behaviors of others give a learner useful hints for finding a new situation, a new behavior for that situation, and the information necessary for behavior acquisition. The RoboCup scenario gives us a good multi-agent test-bed environment where a learner can observe behaviors of...
This paper presents a series of studies on decomposing the large state/action space at the bottom level into several subspaces and merging those subspaces at the higher level. This allows the system to keep the computational resources assigned to the modules compact and small, to reuse previously learned policies, and therefore to avoid the curse of dimensionality. To show the validity of the proposed...