The paper presents aspects of model-free learning control initialization. Model-free learning has several advantages, such as being a general-purpose approach and offering adaptive capability. However, practical implementation is not intuitive at every step. Choosing the time scale that provides the necessary reactivity, or the level of granularity of states and control actions, is not an easy problem. Before it learns environment...
This paper investigates enhanced Inter-Cell Interference Coordination (eICIC) techniques for Heterogeneous Networks (HetNets), and models this strategic coexistence as a multi-player system in which interference management strategies inspired by a form of reinforcement learning known as distributed Q-learning are devised. Specifically, this paper focuses on time domain eICIC techniques in which...
Researchers have created machines which operate autonomously in complex and changing environments. An important problem that has been widely studied is that of autonomous navigation systems, through which attempts have been made to create mechanisms with their own decision making in complex environments. Ideally, an autonomous navigation agent must have an ability to learn while working in its environment...
Electronic markets are places where entities not known in advance can negotiate and agree upon the exchange of products. Intelligent agents can prove very advantageous when representing entities in markets. Mostly, such entities rely on reputation models in order to conclude a transaction. However, reputation is not the only parameter that they could rely on. In this work, we deal with...
The standard Q-learning algorithm suffers from slow convergence and from getting trapped in local optima. Truncated TD estimation of returns improves efficiency, and the simulated annealing algorithm increases the chance of exploration. To accelerate convergence and to avoid local optima, this paper combines the Q-learning algorithm, truncated TD estimation, and the simulated annealing algorithm. We apply improved...
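As a rough illustration of how simulated annealing can drive exploration in Q-learning, the sketch below uses a Metropolis-style acceptance test with a cooling temperature. This is one common formulation and is not taken from the paper, whose abstract is truncated here. The function names `sa_select_action` and `cool`, the cooling rate, and the temperature floor are all assumptions made for illustration, and truncated TD estimation is not shown.

```python
import math
import random


def sa_select_action(q_row, temperature, rng):
    """Choose an action for one state using a Metropolis-style test.

    q_row holds the Q-values of every action in the current state.
    A random candidate replaces the greedy action with probability
    exp((Q[candidate] - Q[greedy]) / temperature).
    """
    greedy = max(range(len(q_row)), key=q_row.__getitem__)
    candidate = rng.randrange(len(q_row))
    # Q[candidate] <= Q[greedy], so the exponent is never positive.
    accept_prob = math.exp((q_row[candidate] - q_row[greedy]) / temperature)
    return candidate if rng.random() < accept_prob else greedy


def cool(temperature, rate=0.95, floor=1e-3):
    """Geometric cooling schedule (hypothetical rate and floor)."""
    return max(temperature * rate, floor)
```

At a high temperature almost every candidate is accepted, so the agent explores widely; as the temperature falls, the choice approaches the greedy action.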
The coexistence of different heterogeneous Radio Access Technologies (RATs) is a significant feature of current wireless networks. Thus, it is important for network elements, such as the Base Stations (BSs) of cellular networks or access points (APs) of wireless local area networks (WLANs) to be reconfigurable according to the real-time network environment. This will enable interconnection between...
A Reinforcement Learning (RL) method applied to dynamic load allocation in an AGC system is presented. The problem can be modeled as a Markov Decision Process (MDP). The Q-learning algorithm, a model-free learning algorithm, is introduced. It learns an optimal action strategy from experience, by exploring an unknown system and receiving rewards. Rewards are chosen to express how well actions control...
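The learning loop described above (explore an unknown system, collect rewards, improve the action strategy) can be sketched with tabular Q-learning. The example below is a minimal sketch on a hypothetical five-state chain, not the AGC load allocation model from the paper: the environment, the reward of 1 at the last state, and all hyperparameters are assumptions chosen only to keep the example small.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy 1-D chain (hypothetical environment).

    Action 1 moves right, action 0 moves left. Reaching the last
    state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy selection with random tie-breaking.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                best = max(q[s])
                a = rng.choice([act for act in (0, 1) if q[s][act] == best])
            s_next = min(s + 1, goal) if a == 1 else max(s - 1, 0)
            r = 1.0 if s_next == goal else 0.0
            # The terminal state contributes no future value.
            future = 0.0 if s_next == goal else max(q[s_next])
            # Model-free update: only the observed transition is used.
            q[s][a] += alpha * (r + gamma * future - q[s][a])
            s = s_next
    return q
```

After training, the greedy action in every non-terminal state of this toy chain is to move right, which is the shortest route to the reward.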
Information technology has given e-retailers a new capability to learn demand in real time. This paper investigates how to integrate this real-time learning technology with the Q-learning algorithm to optimize dynamic pricing in an e-retailing setting. In particular, this paper studies the optimal dynamic pricing problem for seasonal and style products in an e-retailing setting, and validates our approach...
In this paper, an analytical comparison is made between dynamic programming and reinforcement learning methods in dynamic two-player games. The emphasis is on the large number of states and actions available to each player and on the conflicting optimization objectives of these games, which make them complicated to model and analyze. Optimization and decision making are done through quantifying...