## Theory of Combinatorial Algorithms

Prof. Emo Welzl and Prof. Bernd Gärtner

# Mittagsseminar (in cooperation with M. Ghaffari, A. Steger and B. Sudakov)

 Mittagsseminar Talk Information

Date and Time: Thursday, December 19, 2013, 12:15 pm

Duration: 30 minutes

Location: CAB G51

Speaker: Hemant Tyagi

## The adversarial multi-armed bandit problem

In this talk, we will look at some results from a paper by Auer et al. (SIAM J. Comput. '02) on the adversarial version of the classical multi-armed bandit problem. In this version of the problem, a player is given access to a set of $K$ strategies (or arms). At each round $t = 1, 2, \dots$ the player chooses a strategy and receives the corresponding reward. The rewards are assigned to the strategies arbitrarily by an adversary. The goal of the player is to minimize the cumulative expected regret, i.e., the difference between the total expected reward of the best constant sequence of strategies (the same strategy played in every round) and the total expected reward obtained by the player over $T$ rounds. Assuming the adversary is oblivious to the actions of the player, we will first see a randomized algorithm that achieves a regret bound of $O(\sqrt{KT \log K})$ against all oblivious adversaries. We will then look at the construction of a hard adversary against which any algorithm incurs a regret of $\Omega(\sqrt{KT})$.
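To make the setting concrete, the following is a minimal Python sketch of an exponential-weights bandit algorithm in the style of Exp3 from Auer et al. The abstract does not name the algorithm, so treating it as Exp3 is an assumption. The names `exp3`, `exp3_gamma` and `reward_fn` are illustrative, rewards are assumed to lie in $[0, 1]$, the exploration rate is tuned using $T$ as an upper bound on the best arm's total reward, and the weight renormalization is a numerical convenience that is not part of the algorithm's description.

```python
import math
import random


def exp3_gamma(K, T):
    """Exploration rate, assuming T upper-bounds the best arm's total reward."""
    return min(1.0, math.sqrt(K * math.log(K) / ((math.e - 1) * T)))


def exp3(reward_fn, K, T, gamma, rng=None):
    """Play T rounds over K arms; reward_fn(t, arm) must return a value in [0, 1].

    Returns the total reward collected by the player.
    """
    rng = rng or random.Random(0)
    weights = [1.0] * K
    total_reward = 0.0
    for t in range(T):
        total_w = sum(weights)
        # Mix the weight distribution with uniform exploration.
        probs = [(1 - gamma) * w / total_w + gamma / K for w in weights]
        arm = rng.choices(range(K), weights=probs)[0]
        x = reward_fn(t, arm)  # only the chosen arm's reward is observed
        total_reward += x
        # Importance-weighted estimate: unbiased for the reward of each arm.
        x_hat = x / probs[arm]
        weights[arm] *= math.exp(gamma * x_hat / K)
        # Rescale so the largest weight is 1 (probabilities are unchanged).
        top = max(weights)
        weights = [w / top for w in weights]
    return total_reward


if __name__ == "__main__":
    # An oblivious adversary fixes the whole reward table before play starts.
    K, T = 4, 2000
    table_rng = random.Random(42)
    table = [[table_rng.random() for _ in range(K)] for _ in range(T)]
    collected = exp3(lambda t, a: table[t][a], K, T, exp3_gamma(K, T))
    best_fixed = max(sum(row[a] for row in table) for a in range(K))
    print("regret against the best fixed arm:", best_fixed - collected)
```

Because the reward table is generated before the first round and never depends on the player's choices, it models an oblivious adversary. A single run reports realized regret, whereas the bound in the abstract concerns expected regret.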

