Search results for: Jose C. Principe

Items 1 to 3 of 3 results

chapter

Balancing exploration and exploitation in reinforcement learning using a value of information criterion

Isaac J. Sledge, Jose C. Principe

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 2816 - 2820

In this paper, we consider an information-theoretic approach for addressing the exploration-exploitation dilemma in reinforcement learning. We employ the value of information, a criterion that provides the optimal trade-off between the expected returns and a policy's degrees of freedom. As the degrees of freedom are reduced, an agent will exploit more than explore. As the policy degrees of freedom...

article

Reinforcement Learning in Video Games Using Nearest Neighbor Interpolation and Metric Learning

Matthew S. Emigh, Evan G. Kriminger, Austin J. Brockmeier, Jose C. Principe, and others

IEEE Transactions on Computational Intelligence and AI in Games > 2016 > 8 > 1 > 56 - 66

Reinforcement learning (RL) has had mixed success when applied to games. Large state spaces and the curse of dimensionality have limited the ability for RL techniques to learn to play complex games in a reasonable length of time. We discuss a modification of Q-learning to use nearest neighbor states to exploit previous experience in the early stages of learning. A weighting on the state features is...

chapter

A model based approach to exploration of continuous-state MDPs using Divergence-to-Go

Matthew Emigh, Evan Kriminger, Jose C. Principe

2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP) > 1 - 6

In reinforcement learning, exploration is typically conducted by taking occasional random actions. The literature lacks an exploration method driven by uncertainty, in which exploratory actions explicitly seek to improve the learning process in a sequential decision problem. In this paper, we propose a framework called Divergence-to-Go, which is a model-based method that uses recursion similarly to...
