# Mittagsseminar (in cooperation with M. Ghaffari, A. Steger and B. Sudakov)

__Mittagsseminar Talk Information__

**Date and Time**: Thursday, December 19, 2013, 12:15 pm

**Duration**: 30 minutes

**Location**: CAB G51

**Speaker**: Hemant Tyagi

## The adversarial multi-armed bandit problem

In this talk, we will look at some results from a paper by Auer et al. (SIAM J. Comput. '02) on the adversarial version of the classical multi-armed bandit problem. In this version, a player is given access to a set of K strategies (or arms). At each round t = 1, 2, ..., the player chooses a strategy and receives the corresponding reward. The rewards are assigned to the strategies arbitrarily by an adversary. The goal of the player is to minimize the cumulative expected regret, i.e., the difference between the total expected reward of the best constant sequence of strategies (the same single strategy played in every round) and the total expected reward obtained by the player over T rounds. Assuming the adversary is oblivious to the actions of the player, we will first see a randomized algorithm that achieves a regret bound of O((K T log K)^{1/2}) against all oblivious adversaries. We will then look at the construction of a hard adversary against which any algorithm incurs a regret of \Omega((KT)^{1/2}).
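As a rough illustration of the kind of randomized algorithm described above, here is a minimal Python sketch of an exponential-weights strategy in the style of Exp3 from the Auer et al. paper. The abstract does not name the algorithm, so treating it as Exp3 is an assumption; the function names `exp3` and `best_fixed_arm_reward`, the reward-table representation, and the rescaling step for numerical stability are all hypothetical choices made for this sketch. Rewards are assumed to lie in [0, 1] and to be fixed in advance, which models an oblivious adversary.

```python
import math
import random


def exp3(reward_table, gamma, seed=0):
    """Play an Exp3-style strategy against a fixed reward table.

    reward_table[t][i] is the reward in [0, 1] of arm i in round t
    (hypothetical representation of an oblivious adversary).
    gamma in (0, 1] is the exploration parameter.
    Returns (list of chosen arms, total reward collected).
    """
    rng = random.Random(seed)
    num_arms = len(reward_table[0])
    weights = [1.0] * num_arms
    choices = []
    total_reward = 0.0
    for rewards in reward_table:
        total_weight = sum(weights)
        # Mix the weight-proportional distribution with uniform exploration.
        probs = [(1.0 - gamma) * w / total_weight + gamma / num_arms
                 for w in weights]
        arm = rng.choices(range(num_arms), weights=probs)[0]
        reward = rewards[arm]  # only the chosen arm's reward is observed
        choices.append(arm)
        total_reward += reward
        # Importance-weighted estimate: unbiased for the chosen arm's reward.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / num_arms)
        # Rescale so the weights stay bounded; probabilities are unchanged.
        largest = max(weights)
        weights = [w / largest for w in weights]
    return choices, total_reward


def best_fixed_arm_reward(reward_table):
    """Total reward of the best single arm played in every round."""
    num_arms = len(reward_table[0])
    return max(sum(row[i] for row in reward_table) for i in range(num_arms))


# Example: 3 arms, arm 0 always pays 1, the others pay 0.
num_rounds, num_arms = 2000, 3
table = [[1.0, 0.0, 0.0] for _ in range(num_rounds)]
# One tuning of gamma, using T as an upper bound on the best arm's reward.
gamma = min(1.0, math.sqrt(num_arms * math.log(num_arms)
                           / ((math.e - 1) * num_rounds)))
choices, total = exp3(table, gamma)
regret = best_fixed_arm_reward(table) - total
```

The observed `regret` in a single run is a random quantity; the O((K T log K)^{1/2}) bound in the abstract concerns its expectation over the algorithm's internal randomness.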
