Search results for: Shalabh Bhatnagar

Showing items 1 to 2 of 2 results

article

An online prediction algorithm for reinforcement learning with linear function approximation using cross entropy method

Ajin George Joseph, Shalabh Bhatnagar

Machine Learning > 2018 > 107 > 8-10 > 1385-1429

In this paper, we provide two new stable online algorithms for the problem of prediction in reinforcement learning, i.e., estimating the value function of a model-free Markov reward process using the linear function approximation architecture and with memory and computation costs scaling quadratically in the size of the feature set. The algorithms employ the multi-timescale stochastic approximation...

article

An incremental off-policy search in a model-free Markov decision process using a single sample path

Ajin George Joseph, Shalabh Bhatnagar

Machine Learning > 2018 > 107 > 6 > 969-1011

In this paper, we consider a modified version of the control problem in a model-free Markov decision process (MDP) setting with large state and action spaces. The control problem most commonly addressed in the contemporary literature is to find an optimal policy which maximizes the value function, i.e., the long run discounted reward of the MDP. The current settings also assume access to a generative...

INFONA - science communication portal
