Approximate Policy Iteration for Semi-Markov Control Revisited

Abhijit Gosavi

doi:10.1016/j.procs.2011.08.046

Source

Procedia Computer Science, vol. 6 (2011), pp. 249-255

Abstract

The semi-Markov decision process can be solved via reinforcement learning without generating its transition model. We briefly review the existing algorithms based on approximate policy iteration (API) for solving this problem for discounted and average reward under the infinite horizon. API techniques have attracted significant interest in the literature recently. We first present and analyze an extension of an existing API algorithm for discounted reward that can handle continuous reward rates. We then consider its average reward counterpart, which requires an update based on the stochastic shortest path (SSP). We study the convergence properties of the algorithm that does not require the SSP update.

Identifiers

Journal ISSN: 1877-0509
DOI: 10.1016/j.procs.2011.08.046

Authors

Abhijit Gosavi

Keywords

approximate policy iteration; reinforcement learning; average reward; Semi-Markov

Additional information

Publication languages: English

Data set: Elsevier

Publisher

Elsevier Science

