In this paper, we focus on the basic form of the autonomous follow-driving problem with one leader and one follower. A reinforcement learning based throttle and brake control approach is developed for the follower vehicle. A near-optimal control law is learned directly by “trial and error” with the neural dynamic programming algorithm. According to the timely updated following state, the learned control...
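The abstract's neural dynamic programming method learns the control law online; as a loose stand-in for that trial-and-error loop, the sketch below uses tabular Q-learning on a coarsely discretized car-following task. The state bins, actions, dynamics, and reward here are illustrative assumptions, not the paper's setup.

```python
import numpy as np

# Gap-error bins for the follower; bin 2 = desired spacing behind the leader.
N_BINS = 5
BRAKE, HOLD, THROTTLE = 0, 1, 2

def step(s, a):
    """Deterministic toy dynamics: braking widens the gap (s+1),
    throttle closes it (s-1), hold keeps it unchanged."""
    delta = 1 if a == BRAKE else -1 if a == THROTTLE else 0
    s2 = int(np.clip(s + delta, 0, N_BINS - 1))
    return s2, -abs(s2 - 2)  # penalize deviation from the desired gap

rng = np.random.default_rng(0)
Q = np.zeros((N_BINS, 3))
gamma, lr = 0.9, 0.5
for _ in range(20000):       # trial-and-error updates from random states
    s, a = rng.integers(N_BINS), rng.integers(3)
    s2, r = step(s, a)
    Q[s, a] += lr * (r + gamma * Q[s2].max() - Q[s, a])

policy = Q.argmax(axis=1)    # gap too small -> brake; too large -> throttle
```

After enough random trials the greedy policy brakes when the gap is too small (bin 0), applies throttle when it is too large (bin 4), and holds at the desired spacing.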
Routing in scenarios with dynamically changing node locations is quite challenging and time consuming. Emerging wireless communication networks, such as LTE-Advanced, 5G, and device-to-device communications, present such dynamically changing node locations. In mobile ad hoc networks, we very often come across such scenarios. In the Internet of Things (IoT), we will come...
We consider a modular approach to reinforcement learning that represents uncertainty in model parameters by maintaining probability distributions over them. The algorithm, which we call MBDP (model-based Bayesian dynamic programming), can be decomposed into two parallel types of inference: model learning and policy learning. During model learning, we update the posterior distributions of the model over observations...
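The two inference steps the abstract names can be sketched in a minimal tabular form: model learning as a Dirichlet posterior update over transition counts, and policy learning as value iteration on the posterior-mean model. The prior, the toy MDP, and the rewards below are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def mbdp_step(counts, rewards, gamma=0.95, alpha=1.0, iters=200):
    """One round of model-based Bayesian dynamic programming:
    model learning (Dirichlet posterior from transition counts),
    then policy learning (value iteration on the posterior mean)."""
    S, A, _ = counts.shape
    # Model learning: posterior-mean transition probabilities under a
    # symmetric Dirichlet(alpha) prior over next states.
    post = counts + alpha
    P = post / post.sum(axis=2, keepdims=True)   # shape (S, A, S)
    # Policy learning: value iteration on the learned model.
    V = np.zeros(S)
    for _ in range(iters):
        Q = rewards + gamma * (P @ V)            # shape (S, A)
        V = Q.max(axis=1)
    return Q.argmax(axis=1), V

# Toy 2-state, 2-action chain: action 1 tends to reach state 1,
# and state 1 is the rewarding state.
counts = np.array([[[5., 1.], [1., 5.]],
                   [[5., 1.], [1., 5.]]])
rewards = np.array([[0.0, 0.0], [1.0, 1.0]])
policy, V = mbdp_step(counts, rewards)
```

As observations accumulate in `counts`, the posterior concentrates and the planned policy tracks the true dynamics; here the learned policy prefers action 1 in both states, since it steers toward the rewarding state.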
Using reinforcement learning (RL), this paper addresses the problem of call admission control (CAC) and routing in differentiated-services Wavelength Division Multiplexing (WDM) networks, with the objective of maximizing system revenue. The problem is formulated as a finite-state discrete-time dynamic programming problem. Here we adopt the RL method together with a decomposition approach to solve this...
This paper proposes a new algorithm that employs Adaptive Dynamic Programming (ADP) to solve the distributed control problem of urban traffic over an infinite horizon. Urban traffic congestion leads to considerable wasted time and exhaust emissions, so alleviating congestion benefits both the economy and the environment. Signal control at urban intersections is an effective...
Research in reinforcement learning has produced algorithms for optimal decision making under uncertainty that fall into two main types. The first employs a Bayesian framework, where optimality improves with increased computational time, because the resulting planning task takes the form of a dynamic programming problem on a belief tree with an infinite number of states. The second type...
Task decomposition and state abstraction are crucial techniques in reinforcement learning. They allow an agent to ignore aspects of its current state that are irrelevant to its current decision, and therefore speed up dynamic programming and learning. This paper presents the SVI algorithm, which uses a dynamic Bayesian network model to construct an influence graph that indicates relationships between state...
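The state-abstraction idea the abstract describes can be sketched as a reachability check on the influence graph: a state variable is kept only if it can affect reward through the DBN's influence edges. The variable names and the tiny graph below are illustrative assumptions, not the paper's domain.

```python
def relevant_variables(parents, reward_parents):
    """parents[v] = state variables at time t that influence v at t+1.
    A variable is relevant if it can reach the reward by walking the
    influence edges backward; everything else can be abstracted away."""
    relevant, frontier = set(reward_parents), list(reward_parents)
    while frontier:
        v = frontier.pop()
        for p in parents.get(v, ()):
            if p not in relevant:
                relevant.add(p)
                frontier.append(p)
    return relevant

# Toy DBN: reward depends on 'position'; 'position' depends on 'velocity';
# 'music_volume' influences nothing that the reward can see.
parents = {"position": ["position", "velocity"],
           "velocity": ["velocity"],
           "music_volume": ["music_volume"]}
keep = relevant_variables(parents, ["position"])
```

Here the abstraction keeps `position` and `velocity` but drops `music_volume`, shrinking the state space dynamic programming must sweep.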
Storage virtualization provides abstraction over pervasive storage graphs. Data management operations require pervasive search utilities to discover entities and services. In search methods, algorithmic complexity, such as memory-bound problems, error-convergence issues, and the need for supervised training, is prohibitive for large state and solution spaces or high-dimensional state spaces. In addition, among popular...
In this paper, an analytical comparison is made between dynamic programming and reinforcement learning methods in dynamic two-player games. The emphasis is on the large number of states and actions available to each player and the conflicting optimization objectives that make these games complicated to model and analyze. Optimization and decision making are carried out by quantifying...