We study risk-averse multi-armed bandit problems under different risk measures. We consider three risk mitigation models. In the first model, the variations in the reward values obtained at different times are treated as risk, and the objective is to minimize the mean-variance of the observed rewards. In the second and the third models, the quantity of interest is the total reward at the end of...
We consider the shortest path problem in a communication network with random link costs drawn from unknown distributions. A realization of the total end-to-end cost is obtained when a path is selected for communication. The objective is an online learning algorithm that minimizes the total expected communication cost in the long run. The problem is formulated as a multi-armed bandit problem with dependent...
The Minimum Connected Dominating Set (MCDS) problem is to find a minimum-size subset of vertices in a given graph G such that the subset is connected and every vertex of G is either in the subset or adjacent to a vertex in it. This problem is shown to be NP-hard, and the best polynomial-time approximation ratio is O(log n), where n is the number of vertices. The MCDS problem and its derivations are of interest in many...
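The two conditions in the definition above (domination and connectivity) can be checked directly. The following is a minimal sketch, not taken from the paper: the function name `is_connected_dominating_set` and the adjacency-dictionary representation are assumptions made for illustration. It only verifies a candidate set; it does not search for a minimum one, which is the NP-hard part.

```python
from collections import deque


def is_connected_dominating_set(adj, candidate):
    """Check whether `candidate` is a connected dominating set of the graph.

    `adj` maps each vertex to a list of its neighbors (undirected graph).
    This representation is an assumption made for this illustration.
    """
    candidate = set(candidate)
    if not candidate:
        return False

    # Domination: every vertex is in the set or has a neighbor in the set.
    for vertex, neighbors in adj.items():
        if vertex not in candidate and not (candidate & set(neighbors)):
            return False

    # Connectivity: a BFS restricted to the candidate set must reach all of it.
    start = next(iter(candidate))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w in candidate and w not in seen:
                seen.add(w)
                queue.append(w)
    return seen == candidate


# Hypothetical example: the path graph 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
```

On this path graph, `{1, 2, 3}` passes both checks, `{1, 3}` dominates every vertex but is not connected, and `{2}` is connected but leaves vertices 0 and 4 undominated.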
In the classic Multi-Armed Bandit (MAB) problem, there is a given set of arms with unknown reward distributions. At each time, a player selects one arm to play, aiming to maximize the total expected reward over a horizon of length T. It is known that the minimum growth rate of regret (defined as the total expected loss with respect to the ideal scenario of known reward models of all arms) is logarithmic...
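To make the notions of arm selection and regret concrete, here is a minimal sketch. It uses the well-known UCB1 index policy with Bernoulli rewards purely as an illustration; it is not claimed to be the policy studied in the paper. The names `ucb1` and `pseudo_regret`, the Bernoulli reward model, and the example means are all assumptions.

```python
import math
import random


def ucb1(means, horizon, seed=0):
    """Simulate UCB1 on Bernoulli arms and return the pull count of each arm.

    `means` are the true success probabilities. They are used only to
    simulate rewards; the policy itself never reads them.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k    # number of times each arm was played
    sums = [0.0] * k    # total reward collected from each arm

    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initialization: play every arm once
        else:
            # Play the arm with the largest upper confidence bound.
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


def pseudo_regret(means, counts):
    """Expected loss relative to always playing the best arm."""
    best = max(means)
    return sum((best - m) * n for m, n in zip(means, counts))


# Hypothetical three-armed instance.
example_means = [0.2, 0.5, 0.8]
example_counts = ucb1(example_means, horizon=2000)
example_regret = pseudo_regret(example_means, example_counts)
```

The regret depends only on how often each suboptimal arm is played, so a policy achieves logarithmic regret exactly when those pull counts grow logarithmically in T.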
We consider stochastic online linear optimization problems under unknown cost models. At each time, an action is chosen from a compact subset of R^d and a random cost with an unknown distribution (depending on the action) is incurred. The expected value of the random cost is assumed to be an (unknown) linear function over the action space. The objective is to minimize the growth rate of regret (i...
In this paper, we consider a risk model in which the individual claim amount is assumed to be a random variable with fuzzy parameters and the claim number process is characterized as a Poisson process with fuzzy intensity λ. The mean chance of the ultimate ruin is studied. In particular, the expressions of the mean chance of the ultimate ruin are obtained for zero initial surplus and arbitrary initial...