Search results for: Kenji Doya

Items from 1 to 4 out of 4 results

chapter

Policy gradient reinforcement learning method for discrete-time linear quadratic regulation problem using estimated state value function

Tomotake Sasaki, Eiji Uchibe, Hidenao Iwane, Hitoshi Yanami, et al.

2017 56th Annual Conference of the Society of Instrument and Control Engineers of Japan (SICE) > 653 - 657

In this paper, we propose a policy gradient reinforcement learning method which directly estimates the gradient of the state value function (V-function) with respect to a feedback coefficient matrix using measurable data and uses it for policy improvement. The proposed method is applicable to the case where the state-action value function (Q-function) is difficult to estimate, and can update the...

chapter

Self-consistent neuronal population under spike inputs and unbalanced conditions

Carlos E. Gutierrez, Kenji Doya, Junichiro Yoshimoto

2015 International Conference on Intelligent Informatics and Biomedical Sciences (ICIIBMS) > 309 - 312

A single neuron gain function can predict the population activity of homogeneous neurons under strong limitations, such as the stationary state and balanced conditions of the total input. In this work, we propose a modification to the self-consistency model when balanced conditions are not fully satisfied. We present a scaling factor to modify the excitatory weights in a Brunel network. It allows...

chapter

Inverse reinforcement learning using Dynamic Policy Programming

Eiji Uchibe, Kenji Doya

4th International Conference on Development and Learning and on Epigenetic Robotics > 222 - 228

2014 Joint IEEE International Conferences on Development and Learning and Epigenetic Robotics (ICDL-Epirob)

This paper proposes a novel model-free inverse reinforcement learning method based on density ratio estimation under the framework of Dynamic Policy Programming. We show that the logarithm of the ratio between the optimal policy and the baseline policy is represented by the state-dependent cost and the value function. Our proposal is to use density ratio estimation methods to estimate the density...

chapter

Combining learned controllers to achieve new goals based on linearly solvable MDPs

Eiji Uchibe, Kenji Doya

2014 IEEE International Conference on Robotics and Automation (ICRA) > 5252 - 5259

Learning complicated behaviors usually involves intensive manual tuning and expensive computational optimization because we have to solve a nonlinear Hamilton-Jacobi-Bellman (HJB) equation. Recently, Todorov proposed a class of the so-called Linearly solvable Markov Decision Process (LMDP) which converts a nonlinear HJB equation to a linear differential equation. Linearity of the simplified HJB equation...

INFONA - science communication portal
