chapter

Q-learning and enhanced policy iteration in discounted dynamic programming

D P Bertsekas, Huizhen Yu

49th IEEE Conference on Decision and Control (CDC) > 1409 - 1416

2010 49th IEEE Conference on Decision and Control (CDC 2010)

We consider the classical finite-state discounted Markovian decision problem, and we introduce a new policy iteration-like algorithm for finding the optimal Q-factors. Instead of policy evaluation by solving a linear system of equations, our algorithm involves (possibly inexact) solution of an optimal stopping problem. This problem can be solved with simple Q-learning iterations, in the case where...

Keywords:
CONVERGENCE
MARKOV PROCESSES
FUNCTION APPROXIMATION
ITERATIVE METHODS
STOCHASTIC ITERATIVE IMPLEMENTATIONS
