Reinforcement learning algorithms that employ neural networks as function approximators have proven to be powerful tools for solving optimal control problems. However, their training and the validation of final policies can be cumbersome, as neural networks can suffer from problems like local minima or overfitting. When using iterative methods, such as neural fitted Q-iteration, the problem becomes...
A common drawback of standard reinforcement learning algorithms is their inability to scale up to real-world problems. For this reason, an important current research trend is (state-action) value function approximation. A prominent value function approximator is the least-squares temporal difference (LSTD) algorithm. However, for technical reasons, linearity is mandatory: the parameterization of...
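The abstract above mentions LSTD with a linear parameterization. As context, a minimal sketch of the standard batch LSTD computation (not necessarily the method this paper develops): collect transitions, accumulate the matrix A and vector b, and solve for the value-function weights. The names `lstd`, `phi`, and `transitions` are illustrative, and the small ridge term is an assumption added here for numerical invertibility.

```python
import numpy as np

def lstd(transitions, phi, gamma=0.95, reg=1e-6):
    """Least-squares temporal difference: solve A w = b for the weights
    of a linear value function V(s) ~ phi(s) @ w.

    transitions: iterable of (state, reward, next_state) samples
    phi:         feature map, state -> 1-D numpy array
    """
    k = phi(transitions[0][0]).shape[0]
    A = reg * np.eye(k)          # small ridge term so A stays invertible
    b = np.zeros(k)
    for s, r, s_next in transitions:
        f, f_next = phi(s), phi(s_next)
        A += np.outer(f, f - gamma * f_next)   # accumulate TD structure
        b += r * f
    return np.linalg.solve(A, b)
```

On a two-state chain with one-hot features (state 1 self-loops with reward 1, state 0 transitions to state 1 with reward 0), the solved weights match the discounted values V(1) = 1/(1-gamma) and V(0) = gamma * V(1).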
Two common hazards in supervised learning of neural networks are local minima and overfitting. The momentum technique has proved effective at escaping local optima but is vulnerable to overfitting. In contrast, the early stopping technique can avoid overfitting but sometimes terminates in a local minimum. This paper proposes a hybrid...
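Since the abstract is truncated, the paper's hybrid itself is not shown; as background, a minimal sketch of the two ingredients it combines, momentum SGD with early stopping on a held-out loss. The function names and hyperparameter values here are illustrative assumptions, not the paper's method.

```python
import numpy as np

def train_with_early_stopping(grad_fn, val_loss_fn, w, lr=0.1, beta=0.9,
                              patience=20, max_steps=5000):
    """Momentum SGD combined with early stopping.

    grad_fn(w)     -> gradient of the training loss at w
    val_loss_fn(w) -> held-out loss used for the stopping criterion
    """
    v = np.zeros_like(w)
    best_w, best_loss, since_best = w.copy(), val_loss_fn(w), 0
    for _ in range(max_steps):
        v = beta * v - lr * grad_fn(w)   # momentum smooths over shallow minima
        w = w + v
        loss = val_loss_fn(w)
        if loss < best_loss:
            best_w, best_loss, since_best = w.copy(), loss, 0
        else:
            since_best += 1
            if since_best >= patience:   # held-out loss stopped improving
                break
    return best_w                        # weights at the best held-out loss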
This paper considers on-line training of feedforward neural networks. Training examples are only available sampled randomly from a given generator. What emerges in this setting is the problem of adapting step sizes, or learning rates. A scheme for determining step sizes is introduced here that satisfies the following requirements: (i) it does not need any auxiliary problem-dependent parameters,...
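The scheme itself is cut off above; for orientation, one widely used family of step-size adaptation (AdaGrad-style, not necessarily the paper's scheme) shrinks each parameter's rate with its accumulated squared gradient, so no hand-tuned decay schedule is needed. Names and constants below are illustrative assumptions.

```python
import numpy as np

def adagrad(grad_fn, w, steps=500, eta=0.5, eps=1e-8):
    """Per-parameter step-size adaptation (AdaGrad-style sketch):
    each weight's effective rate is eta / sqrt(sum of squared gradients),
    so frequently updated weights automatically get smaller steps."""
    g2 = np.zeros_like(w)
    for _ in range(steps):
        g = grad_fn(w)
        g2 += g * g                          # accumulate squared gradients
        w = w - eta * g / (np.sqrt(g2) + eps)
    return w
```

On a simple quadratic objective the per-parameter rates decay on their own and the iterate converges without any manually scheduled step size.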
A range of different value systems have been proposed for self-motivated agents, including biologically and cognitively inspired approaches. Likewise, these value systems have been integrated with different behavioral systems including reflexive architectures, reward-based learning and supervised learning. However, there is little literature comparing the performance of different value systems for...