The study of intelligent agent training is of great interest to the gaming industry due to its wide application in various game genres and its ability to simulate human-like behavior. In this work, two machine learning techniques, namely a reinforcement learning approach and an Artificial Neural Network (ANN), are used in a fighting game in order to allow the agent/fighter to emulate a human...
Strategic conversational agents often need to trade resources with their opponent conversants -- and trading strategically can lead to better results. While rule-based or supervised agents can be used for such a purpose, here we explore a learning approach based on automatically labelled examples from human players for automatic trading in the game of Settlers of Catan. Our experiments are based on...
Here I apply three reinforcement learning methods to the full, continuous action, swing-up acrobot control benchmark problem. These include two approaches from the literature, CACLA and NM-SARSA, and a novel approach which I refer to as Nelder Mead-SARSA. Nelder Mead-SARSA, like NM-SARSA, directly optimises the state-action value function for action selection, in order to allow continuous action reinforcement...
Protein multiple sequence alignment is significant in the field of bioinformatics as it may reveal important information about the protein sequences' functional, structural or evolutionary relationships. It involves the alignment of three or more biological protein sequences and represents a real challenge both from a biological and a computational point of view. Q-learning is a reinforcement learning...
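For readers unfamiliar with the Q-learning technique named in the abstract above, the following is a minimal sketch of the standard tabular Q-learning update. It is a generic illustration, not the alignment method of the paper; the function name `q_learning_update`, the state and action labels, and the values of `alpha` and `gamma` are all assumptions chosen for the example.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    q is a mapping from (state, action) pairs to estimated values.
    alpha (learning rate) and gamma (discount factor) are illustrative defaults.
    """
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha towards the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-action problem; all unseen values start at 0.0.
q = defaultdict(float)
actions = ["left", "right"]
q_learning_update(q, "s0", "right", reward=1.0, next_state="s1", actions=actions)
```

Starting from an all-zero table, the single update above moves the value of taking "right" in "s0" one tenth of the way towards the observed reward.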
Applications of reinforcement learning for robotic manipulation often assume an episodic setting. However, controllers trained with reinforcement learning are often situated in the context of a more complex compound task, where multiple controllers might be invoked in sequence to accomplish a higher-level goal. Furthermore, training such controllers typically requires resetting the environment between...
In this paper, a variable admittance controller based on reinforcement learning is proposed for human-robot co-manipulation tasks. With the minimisation of jerk throughout a point-to-point movement set as the goal of the reinforcement learning algorithm, the proposed controller can learn the appropriate damping for effective cooperation without any prior knowledge of the target position or other...
Procedural memory and episodic memory are known to be distinct, and both underlie the performance of many tasks. Reinforcement learning (RL) and instance-based learning (IBL) are common approaches to modeling procedural and episodic memory, respectively. In this work, we apply a neural model utilizing RL dynamics and an ACT-R model utilizing IBL productions to the task of modeling human decision...
We present a model of imitative vocal learning consisting of two stages. First, the infant is exposed to the ambient language and forms auditory knowledge of the speech items to be acquired. Second, the infant attempts to imitate these speech items and thereby learns to control the articulators for speech production. We model these processes using a recurrent neural network and a realistic vocal tract...
Automated license plate recognition (ALPR) has been applied to identify vehicles by their license plates and is critical in several important transportation applications. In order to achieve the recognition accuracy levels typically required in the market, it is necessary to obtain properly segmented characters. A standard method, projection-based segmentation, is challenged by substantial variation...
Controlling mobile robots with complex articulated parts and hence many degrees of freedom generates high cognitive load on the operator, especially under demanding conditions such as in Urban Search & Rescue missions. We propose a solution based on reinforcement learning in order to accommodate the robot morphology automatically to the terrain and the obstacles it traverses. In this paper, we...
This paper reports our learning support system for a human learner to visualize his/her mental learning processes with invisible mazes for continuous learning. The objective of this research is to bring the learning ability of the learning agent close to that of a human. To fill in the missing piece of reinforcement learning whose learning process is mainly behavior change, we add two mental learning...
In this paper, we propose a differential reward based online learning algorithm for classifying web pages into predefined topics based on minimal text available in the URLs. It is then compared with two baseline methods, i.e., Support Vector Machine (SVM) and a state-of-the-art Reinforcement Learning Algorithm using recall, precision and F-measure scores. We conducted experiments on large scale Open...
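The recall, precision and F-measure scores mentioned in the abstract above can be computed as in the short sketch below. It assumes a single positive class and the balanced F-measure (F1); the function name `precision_recall_f1` and the topic labels are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(true_labels, predicted_labels, positive):
    """Return (precision, recall, F1) for one class treated as positive."""
    pairs = list(zip(true_labels, predicted_labels))
    # Count true positives, false positives and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical topic labels for four URLs.
true_topics = ["sports", "news", "sports", "news"]
predicted_topics = ["sports", "sports", "sports", "news"]
precision, recall, f1 = precision_recall_f1(true_topics, predicted_topics, "sports")
```

For a multi-topic classifier, one common choice is to compute these scores per topic and then average them; whether the paper does so is not stated in the abstract.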
Constructing the correct Conceptual Graph representing some textual information requires a series of decisions, defined by vertex or edge creation. The process of creating Conceptual Graphs involves semiotics: the semantics, pragmatics and syntactics of the information, as well as graph structuralism and isomorphic projection, all described as decisions of a learning agent or system. The actual process...
Because of the great volume of web information, the information retrieval process of a search engine is of great importance. For each user query, the number of results can reach hundreds of thousands, whereas only the first few results have a chance of being checked by the user; therefore, a search engine must place relevant results in the top ranks. This paper introduces...
Today, feature selection is an active research area in machine learning. The main idea of feature selection is to select a subset of the available features by eliminating features with little or no predictive information. This paper presents a hybrid model with a new local search technique based on reinforcement learning for feature selection. We combined the particle swarm optimization (PSO) with support...
Interactive reinforcement learning constitutes an alternative for improving convergence speed in reinforcement learning methods. In this work, we investigate inter-agent training and present an approach for knowledge transfer in a domestic scenario where a first agent is trained by reinforcement learning and afterwards transfers selected knowledge to a second agent by instructions to achieve more...
Reinforcement learning (RL) is a form of motor learning that robotic therapy devices could potentially manipulate to promote neurorehabilitation. We developed a system that requires trainees to use RL to learn a predefined target movement. The system provides higher rewards for movements that are more similar to the target movement. We also developed a novel algorithm that rewards trainees of different...
In this paper, an adaptive state aggregation Q-Learning method with the capability of multi-agent cooperation is proposed to enhance the efficiency of reinforcement learning (RL) and is applied to box-pushing tasks for humanoid robots. First, a decision tree was applied to partition the state space according to temporal differences in reinforcement learning, so that a real valued action domain could...
We propose a novel biologically plausible actor-critic algorithm using policy gradients in order to achieve practical, model-free reinforcement learning. It does not rely on backpropagation and is the first neural actor-critic relying only on locally available information. We show it has an advantage over pure policy gradient methods for motor learning performance in the polecart problem. We are...
In this paper we introduce the concept of Adaptive Traversability (AT), which we define as a means of autonomous motion control that adapts the robot morphology (the configuration of articulated parts and their compliance) to traverse unknown complex terrain with obstacles in an optimal way. We verify this concept by proposing a reinforcement learning based AT algorithm for mobile robots operating in such...