In this paper, we propose a novel inverse reinforcement learning algorithm with leveraged Gaussian processes that can learn from both positive and negative demonstrations. While most existing inverse reinforcement learning (IRL) methods suffer from the lack of information near low reward regions, the proposed method alleviates this issue by incorporating (negative) demonstrations of what not to do...
Commercial/Military-Off-The-Shelf (COTS/MOTS) Computer Generated Forces (CGF) packages are widely used in modeling and simulation for training purposes. Conventional CGF packages often include artificial intelligence (AI) interfaces, but lack behavior generation and other adaptive capabilities. We believe Machine Learning (ML) techniques can be beneficial to the behavior modeling process, yet such...
Reinforcement learning is a branch of machine learning that allows an agent to learn to take actions based on its observations and the rewards it obtains. In this paper, reinforcement learning agents are trained to play the game of Congklak, a traditional game from Indonesia and Malaysia. Congklak is a deterministic board game played by two players who take turns. However, it was found that the common...
With the availability of more data, classification is increasingly important. However, traditional classification algorithms do not scale well to large data sets and are often not suited when only limited samples of the dataset are available at any point in time. The latter arises, for example, in streaming data when the accumulation of data a priori is infeasible either due to limitations in memory...
Serious games are receiving increasing interest in the area of e-learning. Their development, however, is often still a demanding, specialized and arduous process, especially with regard to reasonable non-player character behaviour. Reinforcement learning and, more recently, deep reinforcement learning have been shown to automatically generate successful AI behaviour to a certain degree. These methods...
Recently, there has been growing interest in applying deep learning to the game AI domain. Among these techniques, deep reinforcement learning is the best known in game AI communities. In this paper, we propose to use redundant outputs in order to adapt training progress in deep reinforcement learning. We compare our method with general ε-greedy on the ViZDoom platform. Since an AI player should select an action only based...
Real-time strategy (RTS) games, such as Blizzard's StarCraft, are fast-paced war simulation games in which players have to manage economies, control many dozens of units, and deal with uncertainty about opposing unit locations in real time. Even in perfect information settings, constructing strong AI systems has been difficult due to enormous state and action spaces and the lack of good state evaluation...
Deep Q-Learning is an effective reinforcement learning method, which has recently obtained human-level performance for a set of Atari 2600 games. Remarkably, the system was trained on the high-dimensional raw visual data. Is Deep Q-Learning equally valid for problems involving a low-dimensional state space? To answer this question, we evaluate the components of Deep Q-Learning (deep architecture,...
This paper proposes an application of reinforcement learning and position-based features in rollout bias training of Monte-Carlo Tree Search (MCTS) for General Video Game Playing (GVGP). As an improvement on Knowledge-based Fast-Evo MCTS proposed by Perez et al., the proposed method is designated for both the GVG-AI Competition and improvement of the learning mechanism of the original method. The...
Simulation Balancing is an optimization algorithm to automatically tune the parameters of a playout policy used inside a Monte Carlo Tree Search. The algorithm fits a policy so that the expected result of a policy matches given target values of the training set. Up to now it has been successfully applied to Computer Go on small 9 × 9 boards but failed for larger board sizes like 19 × 19. On these...
Reinforcement learning is an effective algorithm for brain-machine interfaces (BMIs) which interprets the mapping between neural activities with plasticity and the kinematics. Exploring a large state-action space is difficult when complicated BMIs need to assign credit over both time and space. For BMIs, attention-gated reinforcement learning (AGREL) has been developed to classify multi-actions...
Misdosing medications with sensitive therapeutic windows, such as heparin, can place patients at unnecessary risk, increase length of hospital stay, and lead to wasted hospital resources. In this work, we present a clinician-in-the-loop sequential decision making framework, which provides an individualized dosing policy adapted to each patient's evolving clinical phenotype. We employed retrospective...
Online Social Networks (OSNs) remain the focal point of Internet usage. From the beginning, networking sites have tried to put the right privacy mechanisms in place for users, enabling them to share the right content with the right audience. Despite these efforts, privacy customization remains hard for users across sites. Existing research that addresses this problem mainly focuses on semi-supervised...
A method for hybridizing supervised learning with adaptive dynamic programming was developed to increase the speed, quality, and robustness of on-line neural network learning from an imperfect teacher. Reinforcement learning is used to modify and enhance the original supervisory signal before learning occurs. This paper describes the method of hybridization and presents a model problem in which a...
A large number of videos are generated and uploaded to video websites (such as Youku and YouTube) every day, and video websites play an increasingly important role in daily life. While bringing convenience, this volume of video data makes video summarization, which lets users browse a video easily, more difficult. However, although there are many existing video summarization approaches, the key frames selected...
The ELM (extreme learning machine) algorithm has the advantages of fast learning speed and good generalization performance. It is suitable not only for regression and fitting problems but also for classification and pattern recognition. In this paper, the ELM algorithm is applied to nonlinear function fitting. Its performance and running speed are compared with those of other algorithms, showing the superiority...
To address the high dimensionality, training difficulty, and slow learning speed of BP neural networks in mobile robot path planning, this paper proposes a reinforcement Q-learning algorithm based on the extreme learning machine (the Q-ELM algorithm). Firstly, the characteristic of reinforcement learning is combining the dynamic network with supervised learning, and the...
In a pursuit-evasion game, a pursuer that learns its strategy with any learning algorithm usually captures the evader when the game environment is similar to the one the pursuer was trained on. However, the trained pursuer may not be able to capture the evader if the environment of the pursuit-evasion game differs from the training environment. In this paper, we propose a fuzzy...
In this paper, a group of mobile robots learns to solve a target-reaching problem in a simulated grid environment filled with obstacles. Each robot knows its distance to the target and can communicate with the others. The proposed learning algorithm combines a reinforcement learning algorithm and a swarm optimization algorithm. Q-learning, which is a reinforcement learning algorithm, is modified to...
When reinforcement learning is applied to large state spaces, such as those occurring in board games, a good function approximator for the value function is very important. In previous research, multi-layer perceptrons have often been used quite successfully as function approximators for learning to play particular games with temporal difference learning. With...