Search results for: Shalabh Bhatnagar

Items from 1 to 2 out of 2 results

article

Multiscale Q-learning with linear function approximation

Shalabh Bhatnagar, K. Lakshmanan

Discrete Event Dynamic Systems > 2016 > 26 > 3 > 477-509

We present in this article a two-timescale variant of Q-learning with linear function approximation. Both Q-values and policies are assumed to be parameterized with the policy parameter updated on a faster timescale as compared to the Q-value parameter. This timescale separation is seen to result in significantly improved numerical performance of the proposed algorithm over Q-learning. We show that...
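
As a rough illustration of what "timescale separation" means here, the sketch below builds two step-size schedules whose ratio shrinks to zero, so the parameter with the larger steps (the policy parameter, per the abstract) effectively moves faster. The function names and the specific exponents are assumptions chosen for illustration; they are not taken from the article, whose actual schedules are not given in this listing.

```python
# Illustrative two-timescale step sizes (hypothetical schedules, not the article's).

def policy_step(n: int) -> float:
    """Faster-timescale step size: a_n = 1 / (n + 1)**0.6 (assumed exponent)."""
    return 1.0 / (n + 1) ** 0.6


def q_step(n: int) -> float:
    """Slower-timescale step size: b_n = 1 / (n + 1) (assumed schedule)."""
    return 1.0 / (n + 1)


def step_ratio(n: int) -> float:
    """Ratio b_n / a_n; it tends to zero, which is what separates the timescales."""
    return q_step(n) / policy_step(n)


# Both schedules sum to infinity while their squares have finite sums,
# which are the usual stochastic-approximation step-size conditions.
ratios = [step_ratio(n) for n in (0, 10, 1000, 100000)]
```

With these choices the ratio equals (n + 1) raised to the power -0.4, so the Q-value parameter looks almost static from the point of view of the policy update as n grows.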

chapter

A novel Q-learning algorithm with function approximation for constrained Markov decision processes

K. Lakshmanan, Shalabh Bhatnagar

2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton) > 400-405

We present a novel multi-timescale Q-learning algorithm for average cost control in a Markov decision process subject to multiple inequality constraints. We formulate a relaxed version of this problem through the Lagrange multiplier method. Our algorithm is different from Q-learning in that it updates two parameters — a Q-value parameter and a policy parameter. The Q-value parameter is updated on...
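
For orientation, a Lagrangian relaxation of a constrained average-cost problem typically has the generic shape sketched below. The symbols are assumed notation, not the authors': J(π) for the average cost under policy π, G_i(π) for the i-th average constraint cost, α_i for its bound, λ_i for the multipliers, and N for the number of constraints. The chapter's exact formulation is not shown in this listing.

```latex
\begin{align*}
\text{Constrained problem:}\quad
  & \min_{\pi} J(\pi)
    \quad \text{subject to } G_i(\pi) \le \alpha_i,\ i = 1, \dots, N, \\
\text{Lagrangian relaxation:}\quad
  & L(\pi, \lambda) = J(\pi) + \sum_{i=1}^{N} \lambda_i \bigl( G_i(\pi) - \alpha_i \bigr),
    \quad \lambda_i \ge 0.
\end{align*}
```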

INFONA - science communication portal
