STATS 710: Sequential Decision Making with mHealth Applications

Fall 2016

Class Information

Instructor Information

Name: Susan Murphy

Office: 445D West Hall

Office Hours: By appointment

Email: samurphy@umich.edu

Name: Ambuj Tewari

Office: 454 West Hall

Office Hours: By appointment

Email: tewaria@umich.edu

Grading

This is an advanced graduate-level course. We expect you to engage with the material out of intellectual curiosity, so the formal course requirements are minimal. The final grade will be determined by the quality of your scribed lecture notes and one final project, using the weights given below.

Topics

A tentative list of topics is as follows.

Schedule

Lecture number

Day | Notes

Topics

1

Sep 06 | lec01.pdf

  • Course logistics
  • Introduction to multi-armed bandits
  • Basic consistency result from Robbins (1952)

2

Sep 08 | lec02.pdf

  • Epsilon-greedy and its regret analysis from Auer et al. (2002)

3

Sep 13 | lec03.pdf

  • Non-standard time 2:30-4
  • Finish regret analysis of epsilon-greedy
  • UCB and its regret analysis from Auer et al. (2002)

4

Sep 15 | lec04.pdf

  • Non-standard time 2:30-4
  • Finish regret analysis of UCB

5

Sep 20 | lec05.pdf

  • From gap-based, distribution-specific regret bounds to distribution-independent regret bounds
  • Distribution independent lower bound (informal version)
  • Thompson sampling for Bernoulli rewards with Beta priors

6

Sep 22 | lec06.pdf

  • Regret analysis of Thompson Sampling from Agrawal and Goyal (2013)

7

Sep 27 | lec07.pdf

  • Finish regret analysis of Thompson Sampling

8

Sep 29 | lec08.pdf

  • Non-stochastic bandit problem and the Exp3 algorithm from Auer et al. (2002)

Oct 04

  • CLASS CANCELLED
  • Initial project proposals due

9

Oct 06 | lec09.pdf

  • Finish regret analysis of Exp3
  • Introduction to contextual bandits

10

Oct 11 | lec10.pdf

  • In old location: 4246 Randall
  • Competing with a class of policies
  • Reduction to multi-armed bandits (finite context space)

11

Oct 13 | lec11.pdf

  • Epoch Greedy algorithm from Langford and Zhang (2008)

12

Oct 18 | lec12.pdf

  • Exp4 algorithm from Auer et al. (2002)
  • Final project proposals due

13

Oct 20 | lec13.pdf

  • VE algorithm and regret analysis from Beygelzimer et al. (2011)

Oct 25

  • NO CLASS

Oct 27

  • NO CLASS

14

Nov 01 | lec14.pdf

  • Lecture by Susan
  • Introduction to RL; transitioning from contextual bandits

15

Nov 03 | lec15.pdf

  • Lecture by Susan
  • UCB for finite horizon RL from Auer & Ortner (2005)

16

Nov 08 | lec16.pdf

  • In old location: 4246 Randall
  • ILOVETOCONBANDITS from Agarwal et al. (2014)

17

Nov 10 | lec17.pdf

  • Lecture by Susan
  • UCB for finite horizon RL from Auer & Ortner (2005)
  • Susan’s notes

18

Nov 15 | Slides from Elad’s talk

19

Nov 17 | lec19.pdf

  • PAC bounds for multi-armed bandits from Even-Dar et al. (2002)

20

Nov 22 | lec20.pdf

  • Finish analysis of Median Elimination from Even-Dar et al. (2002)

Nov 24

  • THANKSGIVING BREAK

21

Nov 29 | lec21.pdf

  • Online ranking with top-1 feedback from Chaudhuri and Tewari (2015)
  • An instance of a partial monitoring game with a combinatorially large action space

22

Dec 01 | lec22.pdf

  • Online ranking with top-1 feedback (continued)
  • Non-standard time 4-5

23

Dec 06 | lec23.pdf

24

Dec 08 | lec24.pdf

  • Guest lecture by Nan Jiang
  • Contextual decision processes with low Bellman rank

Dec 13

  • Location: East Conference Room, 4th floor, Rackham Building
  • Poster Setup: 3-3:30
  • Poster Presentations: 3:30-5:00

Dec 14

  • Project reports due

additional01.pdf

  • Note: A Reinforcement Learning System to Encourage Physical Activity in Diabetes Patients

additional02.pdf

  • Note: Risk-Averse Bandits

additional03.pdf

  • Note: Bandits with heavy tailed distributions

Readings

Books and Monographs

Bandit Problems

Partial Monitoring

Contextual Bandit Problems

Learning in MDPs

                                        

Overviews of mHealth

Projects