This paper introduces an algorithm based on maximum likelihood estimation (MLE) to learn the structure and parameters of a continuous-density HMM (CDHMM). One of the most cumbersome problems encountered in applications that incorporate an HMM as a model is guessing the required number of states and the overall structure, especially when the source of information is continuous and variable (e.g. speech). In our algorithm, induction steps...
Reinforcement learning suffers from scalability problems due to the state-space explosion and the temporal credit assignment problem. Knowledge-based approaches have received significant attention in this area. Reward shaping is one particular approach to incorporating domain knowledge into reinforcement learning. The theoretical and empirical analysis in this paper reveals important properties of this principle,...
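The snippet above concerns reward shaping in general; the paper's own formulation is not shown here. As a point of reference, the standard potential-based form of shaping (Ng, Harada & Russell, 1999) adds F(s, s') = γΦ(s') − Φ(s) to the environment reward, which provably preserves optimal policies. A minimal Python sketch, with a hypothetical grid-world potential:

```python
def shaped_reward(r, s, s_next, phi, gamma=0.99):
    """Potential-based shaping: add F(s, s') = gamma * phi(s') - phi(s)
    to the environment reward; optimal policies are preserved."""
    return r + gamma * phi(s_next) - phi(s)

# Hypothetical potential for a 10x10 grid world: negative Manhattan
# distance to the goal cell (9, 9).
phi = lambda s: -(abs(s[0] - 9) + abs(s[1] - 9))
r_shaped = shaped_reward(0.0, (0, 0), (0, 1), phi)  # positive bonus for moving toward the goal
```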
A reinforcement learning (RL) agent that is to perform successfully in a complex and dynamic environment has to continuously learn and adapt to perform new tasks. This requires the agent not only to extract control and representation knowledge from the tasks it has learned, but also to reuse that extracted knowledge to learn new tasks. This paper presents a new method to extract this control and representational...
A macro-action is a typical series of useful actions that yields high expected rewards for an agent. Murata et al. proposed an actor-critic model that can generate macro-actions automatically based on information about state values and state-visiting frequencies. However, their model does not assume that the generated macro-actions will be used for learning different tasks. In this paper, we extend...
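Murata et al.'s actor-critic mechanism is not detailed in the snippet; for concreteness, a macro-action can be represented minimally as a fixed sequence of primitive actions executed open-loop, with its reward accumulated SMDP-style. A hedged sketch, assuming a hypothetical env.step(state, action) -> (next_state, reward, done) interface:

```python
def run_macro_action(env, state, macro, gamma=0.99):
    """Execute a macro-action (a fixed sequence of primitive actions)
    and accumulate its discounted reward, SMDP-style."""
    total, discount = 0.0, 1.0
    for action in macro:
        state, reward, done = env.step(state, action)  # hypothetical interface
        total += discount * reward
        discount *= gamma
        if done:
            break
    return state, total, discount  # discount = gamma**k after a k-step macro
```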
Hidden Markov models (HMMs) are widely applied to the analysis of time-dependent data sequences in areas such as nonlinear signal processing, natural language processing, and bioinformatics. Training data for HMMs come in two possible formats: a large set of time-dependent sequential data, or a single infinitely long sequence. The learning process is one of the main concerns in machine learning. For a large set of...
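The learning procedures themselves are not shown in the snippet; both data formats, however, rest on the same forward recursion used to evaluate sequence likelihood (and inside Baum-Welch training). A minimal NumPy sketch with illustrative parameter names:

```python
import numpy as np

def forward_likelihood(A, B, pi, obs):
    """Forward algorithm: P(obs | HMM) by dynamic programming.

    A   -- (N, N) transition matrix, A[i, j] = P(state j | state i)
    B   -- (N, M) emission matrix,  B[i, k] = P(symbol k | state i)
    pi  -- (N,)   initial state distribution
    obs -- sequence of observation symbol indices
    """
    alpha = pi * B[:, obs[0]]          # initialize with the first emission
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # propagate, then re-weight by emission
    return alpha.sum()
```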
In the HRL field there are several main methods, such as HAMs, options, and MAXQ. These methods all rely on the theory of SMDPs. However, SMDPs do not specify how the overall task can be decomposed into a collection of subtasks. This paper introduces the concept of "policy-coupled" SMDPs into HAMs. It defines the concept of HAM-decomposability and establishes the relations among the HAM machine, HAM-decomposable,...
A basic problem for intelligent systems is choosing adaptive actions to perform in a non-stationary environment. Due to the combinatorial complexity of actions, an agent cannot possibly consider every option available to it at every instant in time. It needs to find good policies that dictate the optimal action to perform in each situation. This paper proposes an algorithm, called UQ-learning, to better solve...
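UQ-learning's specifics are beyond this snippet; it is described against the background of standard tabular Q-learning, whose one-step update is sketched below (illustrative, not the paper's algorithm):

```python
from collections import defaultdict

Q = defaultdict(float)  # (state, action) -> estimated value

def q_update(s, a, r, s_next, actions, alpha=0.1, gamma=0.95):
    """One tabular Q-learning step: move Q(s, a) toward the bootstrapped
    target r + gamma * max_a' Q(s', a')."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
```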
Previous work describing the generalization ability of learning algorithms is based on independent and identically distributed (i.i.d.) samples. In this paper we go beyond this classical framework by studying the learning performance of the empirical risk minimization (ERM) algorithm with Markov chain samples. We obtain a bound on the rate of uniform convergence of the ERM algorithm with...
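In the standard notation the snippet relies on, the ERM hypothesis and the uniform-convergence quantity being bounded are:

```latex
\[
  f_{\mathrm{ERM}} \;=\; \arg\min_{f \in \mathcal{F}} \; \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(f(x_i), y_i\bigr),
  \qquad
  \sup_{f \in \mathcal{F}} \bigl| R(f) - R_n(f) \bigr|,
\]
```

where R(f) is the expected risk, R_n(f) the empirical risk, and the samples (x_i, y_i) are drawn along a Markov chain rather than i.i.d.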
In this paper, nondeterministic indirect reinforcement learning (RL) techniques for controlling the transmission times and power of wireless network nodes are presented. Indirect RL facilitates planning as well as learning, which ultimately leads to convergence to optimal actions in fewer episodes or time steps than direct RL. Three Dyna-architecture-based algorithms for nondeterministic environments...
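The paper's three algorithms are not spelled out in the snippet; they build on the Dyna architecture, whose core loop interleaves direct updates with planning from a learned model. A minimal Dyna-Q sketch (assuming Q is a defaultdict(float); storing sampled outcomes is one simple way to accommodate nondeterminism):

```python
import random

def dyna_q_step(Q, model, s, a, r, s_next, actions,
                n_plan=10, alpha=0.1, gamma=0.95):
    """One Dyna-Q iteration: direct RL update, model update, then
    n_plan simulated (planning) updates drawn from the model."""
    def backup(s1, a1, r1, s2):
        best = max(Q[(s2, b)] for b in actions)
        Q[(s1, a1)] += alpha * (r1 + gamma * best - Q[(s1, a1)])

    backup(s, a, r, s_next)                            # learn from the real step
    model.setdefault((s, a), []).append((r, s_next))   # sampled outcomes: nondeterminism
    for _ in range(n_plan):                            # planning from remembered experience
        (ps, pa), outcomes = random.choice(list(model.items()))
        pr, ps2 = random.choice(outcomes)
        backup(ps, pa, pr, ps2)
```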
We address the problem of label assignment in computer vision: given a novel 3D or 2D scene, we wish to assign a unique label to every site (voxel, pixel, superpixel, etc.). To this end, the Markov Random Field framework has proven to be a model of choice as it uses contextual information to yield improved classification results over locally independent classifiers. In this work we adapt a functional...
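The labeling objective behind this framework is conventionally written as an energy over unary and pairwise terms (standard notation, not specific to this paper):

```latex
\[
  E(\mathbf{x}) \;=\; \sum_{i \in \mathcal{V}} \theta_i(x_i)
                \;+\; \sum_{(i,j) \in \mathcal{E}} \theta_{ij}(x_i, x_j),
\]
```

where the unary terms theta_i come from local classifiers, the pairwise terms theta_ij encode contextual agreement between neighboring sites, and the assigned labeling minimizes E.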
Markov random field (MRF, CRF) models are popular in computer vision. However, in order to remain computationally tractable they are limited to incorporating only local interactions, and cannot model global properties such as connectedness, a potentially useful high-level prior for object segmentation. In this work, we overcome this limitation by deriving a potential function that enforces the...
This paper presents a modified R-learning method based on the traditional average-reward reinforcement learning algorithm. Reinforcement learning problems constitute an important class of learning and control problems faced by artificial intelligence systems. The general framework of reinforcement learning can be divided into two forms: discounted-reward reinforcement learning and average-reward reinforcement...
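For reference, the classical R-learning updates (Schwartz, 1993) that such a modification would start from maintain an action-value table Q and an average-reward estimate rho:

```latex
\begin{align*}
  Q(s,a) &\leftarrow Q(s,a) + \alpha\bigl[\, r - \rho + \max_{a'} Q(s',a') - Q(s,a) \,\bigr],\\
  \rho   &\leftarrow \rho + \beta\bigl[\, r + \max_{a'} Q(s',a') - \max_{a} Q(s,a) - \rho \,\bigr],
\end{align*}
```

with the rho update applied only on steps where a greedy action was taken.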
In this paper, we propose a novel framework to extract temporally extended concepts in a grid-world environment using a predictive data structure called a temporal-difference network. First, a reinforcement learning agent tries to learn its environment on the task of wall following. After that, we train a newly introduced temporal-difference network (TDN) in the agent's brain in order to gain a predictive...
In this paper we propose a novel strategy for converging the dynamic policies generated by adaptive agents, which receive and accumulate rewards for their actions. The goal of the proposed strategy is to speed up such agents' convergence to a good policy in dynamic environments. Since it is difficult to maintain a good value estimate for a state amid continuous changes in the environment, previous policies...
Markov decision processes are one of the most popular frameworks for reinforcement learning. The entropy of probability density functions of Markov decision processes is referred to as the stochastic complexity. The stochastic complexity is helpful for tuning the parameters of an action-selection strategy to alleviate the exploration-exploitation dilemma. In this paper, we improve an action-selection...
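A common entropy-controlled action-selection rule, in the spirit of what the snippet describes, is Boltzmann (softmax) exploration: the temperature parameter directly tunes the entropy of the action distribution. An illustrative sketch (not necessarily the paper's strategy):

```python
import numpy as np

def boltzmann_policy(q_values, temperature=1.0):
    """Softmax action selection: higher temperature -> higher-entropy
    (more exploratory) action distribution."""
    prefs = np.asarray(q_values, dtype=float) / temperature
    prefs -= prefs.max()                  # for numerical stability
    probs = np.exp(prefs)
    return probs / probs.sum()

def policy_entropy(probs):
    """Shannon entropy H(pi) = -sum_a pi(a) * log pi(a)."""
    probs = np.asarray(probs)
    return -np.sum(probs * np.log(probs + 1e-12))
```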
Gaussian Markov random fields are applied in many statistical inference problems. Probabilistic models for such inference are constructed within the framework of Bayesian statistics and have network structures. In the present paper, we analyze the statistical performance of inference in Gaussian Markov random fields on complex networks, including scale-free networks. We discuss...
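For orientation, a zero-mean Gaussian Markov random field has the standard density

```latex
\[
  p(\mathbf{x}) \;\propto\; \exp\!\Bigl(-\tfrac{1}{2}\, \mathbf{x}^{\top} Q\, \mathbf{x}\Bigr),
  \qquad
  Q_{ij} \neq 0 \;\iff\; (i,j) \in \mathcal{E} \ \text{or}\ i = j,
\]
```

where the sparsity pattern of the precision matrix Q matches the network's edge set, so the graph topology (e.g., scale-free) directly shapes the inference.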
Stochastic decision processes in reinforcement learning are usually formulated as Markov decision processes, which are stationary and ergodic. In practice, however, some stochastic decision processes are not necessarily Markov, stationary, and/or ergodic. In this paper, using an information-theoretic property, we show a class of stochastic decision processes in reinforcement learning in which return...
This research integrates rigorous methods from reinforcement learning (RL) and control engineering with a behavioral (ethological) approach to agent technology. The main outcome is a hybrid architecture for intelligent autonomous agents targeted at Artificial-Life-like environments. The architecture adopts several concepts from biology and shows that they can provide robust solutions in some areas....
We describe a reinforcement-learning-based scheme to estimate the stationary distribution of subsets of states of large Markov chains. 'Split sampling' ensures that the algorithm only needs to encode the state transitions and does not need to know any other property of the Markov chain. (An earlier scheme required knowledge of the column sums of the transition probability matrix.) This algorithm...
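The split-sampling scheme itself is not given in the snippet. As a baseline that likewise needs only observed transitions, stationary probabilities of a state subset can be estimated from empirical visit frequencies along one long simulated trajectory; a hedged sketch with a hypothetical step(s) sampler:

```python
from collections import Counter

def empirical_stationary(step, s0, subset, n_steps=100_000):
    """Estimate pi(s) for s in `subset` from visit frequencies along a
    single trajectory; `step(s)` samples the next state (model-free)."""
    counts, s = Counter(), s0
    for _ in range(n_steps):
        s = step(s)
        if s in subset:
            counts[s] += 1
    return {s: counts[s] / n_steps for s in subset}
```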
Reinforcement learning is applied to the automatic parking problem for a four-wheeled automobile. The automobile, controlled by reinforcement learning, learns the appropriate steering angle with respect to the outer environment using distance-measuring sensors. The Rational Policy Making (RPM) method is introduced in order to cope with random start positions. The present method has the advantage of easy implementation...