We consider the setting of stochastic bandit problems with a continuum of arms indexed by [0,1]^d. We first point out that the strategies considered so far in the literature only provided theoretical guarantees of the form: given some tuning parameters, the regret is small with respect to a class of environments that depends on these parameters. This is however not the right...
This paper studies the deviations of the regret in a stochastic multi-armed bandit problem. When the total number of plays n is known beforehand by the agent, Audibert et al. (2009) exhibit a policy such that with probability at least 1-1/n, the regret of the policy is of order log n. They have also shown that such a property is not shared by the popular UCB1 policy of Auer et al. (2002). This work...
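For readers unfamiliar with the UCB1 policy mentioned in this abstract, the following is a minimal sketch of its standard index rule: play each arm once, then pick the arm maximising the empirical mean plus sqrt(2 ln(plays so far) / pulls of that arm). It is not the policy of Audibert et al. (2009). The names `ucb1` and `make_bernoulli_arms`, the Bernoulli reward model and the seed are hypothetical choices made only for illustration.

```python
import math
import random


def ucb1(pull, n_arms, horizon):
    """Run the UCB1 index rule for `horizon` rounds.

    `pull(arm)` is assumed to return a reward in [0, 1].
    Returns the number of times each arm was played.
    """
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # initialisation: play each arm once
        else:
            plays_so_far = t - 1
            arm = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(plays_so_far) / counts[i]),
            )
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
    return counts


def make_bernoulli_arms(means, seed=0):
    """Hypothetical test environment: arm i pays 1 with probability means[i]."""
    rng = random.Random(seed)
    return lambda arm: 1.0 if rng.random() < means[arm] else 0.0


counts = ucb1(make_bernoulli_arms([0.2, 0.8]), n_arms=2, horizon=2000)
```

In this toy run the arm with the higher mean should receive most of the plays. The sketch only illustrates the expected-regret behaviour of the index rule and says nothing about the high-probability deviations that the paper studies.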
Many problems, such as cognitive radio, parameter control of a scanning tunnelling microscope or internet advertisement, can be modelled as non-stationary bandit problems where the distributions of rewards change abruptly at unknown time instants. In this paper, we analyze two algorithms designed for solving this issue: discounted UCB (D-UCB) and sliding-window UCB (SW-UCB). We establish an upper-bound...
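The sliding-window idea named in this abstract can be sketched as follows: compute the UCB index from only the most recent `window` plays, so that observations made before an abrupt change are eventually forgotten. This is a simplified illustration, not the authors' exact algorithm. The exploration constant `xi`, the reward bound `bound`, the forced play of arms absent from the window, and the function names are all assumptions.

```python
import math
from collections import deque


def sw_ucb_choose(history, n_arms, window, xi=0.6, bound=1.0):
    """Pick the next arm from the last `window` (arm, reward) pairs.

    `xi` and `bound` are assumed tuning constants, not values from the paper.
    """
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for arm, reward in history:
        counts[arm] += 1
        sums[arm] += reward
    # An arm with no play inside the window has no estimate: play it first.
    for arm in range(n_arms):
        if counts[arm] == 0:
            return arm
    # len(history) + 1 is the current round while the window is filling up.
    log_term = math.log(min(len(history) + 1, window))

    def index(arm):
        padding = bound * math.sqrt(xi * log_term / counts[arm])
        return sums[arm] / counts[arm] + padding

    return max(range(n_arms), key=index)


def run_sw_ucb(reward_fn, n_arms, horizon, window):
    """Play `horizon` rounds; `reward_fn(t, arm)` gives the reward at round t."""
    history = deque(maxlen=window)  # old plays drop out automatically
    chosen = []
    for t in range(horizon):
        arm = sw_ucb_choose(history, n_arms, window)
        history.append((arm, reward_fn(t, arm)))
        chosen.append(arm)
    return chosen
```

As a usage example, a deterministic environment in which the rewarding arm switches at an arbitrary round can be passed as `reward_fn`. After roughly one window length, the plays should concentrate on the new best arm.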
In this paper, we study the problem of estimating the mean values of all the arms uniformly well in the multi-armed bandit setting. If the variances of the arms were known, one could design an optimal sampling strategy by pulling the arms proportionally to their variances. However, since the distributions are not known in advance, we need to design adaptive sampling strategies to select an arm at...
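The known-variance baseline described in this abstract, pulling arms in proportion to their variances, can be sketched as below. The function name and the largest-remainder rounding used to reach an integer number of pulls are assumptions made for illustration. The adaptive strategies the paper goes on to study are not shown.

```python
def proportional_allocation(variances, budget):
    """Split `budget` pulls across arms in proportion to their variances.

    Assumes at least one variance is strictly positive. Rounding uses the
    largest-remainder rule, which is an illustrative choice.
    """
    total = sum(variances)
    raw = [budget * v / total for v in variances]
    pulls = [int(x) for x in raw]  # floor of each ideal share
    leftover = budget - sum(pulls)
    # Give the remaining pulls to the arms with the largest fractional parts.
    order = sorted(range(len(raw)), key=lambda k: raw[k] - pulls[k], reverse=True)
    for k in order[:leftover]:
        pulls[k] += 1
    return pulls


print(proportional_allocation([1.0, 3.0], 100))  # [25, 75]
```

With variances 1 and 3 and a budget of 100 pulls, the noisier arm receives three times as many pulls, which is what equalises the estimation error across arms when the variances are known.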
The ALT conference series focuses on studies of learning from an algorithmic and mathematical perspective. Over the last decades, various models of learning have emerged, and a main goal is to investigate how various learning problems can be formulated and solved in some of the abstract models.
Our ALT’2010 paper claimed that every computably finitely thick [LZ96, Definition 9] class of languages can be identified by enumeration operator [MZ10, Definition 1(e) and Theorem 12]. However, this is, in fact, false. We intend to include a proof of the claim’s negation in the journal version of our paper, which has been submitted.
We analyze iterative learning in the limit from positive data with the additional information provided by a counter. The simplest type of counter provides the current iteration number (counting up from 0 to infinity), which is known to improve learning power over plain iterative learning. We introduce five other (weaker) counter types, for example only providing some unbounded and non-decreasing...
This paper adapts and investigates the paradigm of robust learning, originally defined in the inductive inference literature for classes of recursive functions, to learning languages from positive data. Robustness is a very desirable property, as it captures a form of invariance of learnability under admissible transformations on the object of study. The classes of languages of interest are automatic...
We define and study a learning paradigm that sits between identification in the limit and classification. More precisely, we expect that a learner be able to identify in the limit which members of a set D of n possible data belong to a target language, where n and D are arbitrary. We show that Ex- and BC-learning are often more difficult than performing this classification task, taking into account...
Patterns provide a simple, yet powerful means of describing formal languages. However, for many applications, neither patterns nor their generalized versions of typed patterns are expressive enough. This paper extends the model of (typed) patterns by allowing relations between the variables in a pattern. The resulting formal languages are called Relational Pattern Languages (RPLs). We study the problem...