Preface 1 Introduction 2 Prediction with expert advice 2.1 Weighted average prediction 2.2 An optimal bound 2.3 Bounds that hold uniformly over time 2.4 An improvement for small losses 2.5 Forecasters using the gradient of the loss 2.6 Scaled losses and signed games 2.7 The multilinear forecaster 2.8 The exponential forecaster for signed games 2.9 Simulatable experts 2.10 Minimax regret 2.11 Discounted regret 2.12 Bibliographic remarks 2.13 Exercises 3 Tight bounds for specific losses 3.1 Introduction 3.2 Follow the best expert 3.3 Exp-concave loss functions 3.4 The greedy forecaster 3.5 The aggregating forecaster 3.6 Mixability for certain losses 3.7 General lower bounds 3.8 Bibliographic remarks 3.9 Exercises 4 Randomized prediction 4.1 Introduction 4.2 Weighted average forecasters 4.3 Follow the perturbed leader 4.4 Internal regret 4.5 Calibration 4.6 Generalized regret 4.7 Calibration with checking rules 4.8 Bibliographic remarks 4.9 Exercises 5 Efficient forecasters for large classes of experts 5.1 Introduction 5.2 Tracking the best expert 5.3 Tree experts 5.4 The shortest path problem 5.5 Tracking the best of many actions 5.6 Bibliographic remarks 5.7 Exercises 6 Prediction with limited feedback 6.1 Introduction 6.2 Label efficient prediction 6.3 Lower bounds 6.4 Partial monitoring 6.5 A general forecaster for partial monitoring 6.6 Hannan consistency and partial monitoring 6.7 Multi-armed bandit problems 6.8 An improved bandit strategy 6.9 Lower bounds for the bandit problem 6.10 How to select the best action 6.11 Bibliographic remarks 6.12 Exercises 7 Prediction and playing games 7.1 Games and equilibria 7.2 Minimax theorems 7.3 Repeated two-player zero-sum games 7.4 Correlated equilibrium and internal regret 7.5 Unknown games: game-theoretic bandits 7.6 Calibration and correlated equilibrium 7.7 Blackwell's approachability theorem 7.8 Potential-based approachability 7.9 Convergence to Nash equilibria 7.10 Convergence in unknown games 7.11 Playing against opponents that react 7.12 Bibliographic remarks 7.13 Exercises 8 Absolute loss 8.1 Simulatable experts 8.2 Optimal algorithm for simulatable experts 8.3 Static experts 8.4 A simple example 8.5 Bounds for classes of static experts 8.6 Bounds for general classes 8.7 Bibliographic remarks 8.8 Exercises 9 Logarithmic loss 9.1 Sequential probability assignment 9.2 Mixture forecasters 9.3 Gambling and data compression 9.4 The minimax optimal forecaster 9.5 Examples 9.6 The Laplace mixture 9.7 A refined mixture forecaster 9.8 Lower bounds for most sequences 9.9 Prediction with side information 9.10 A general upper bound 9.11 Further examples 9.12 Bibliographic remarks 9.13 Exercises 10 Sequential investment 10.1 Portfolio selection 10.2 The minimax wealth ratio 10.3 Prediction and investment 10.4 Universal portfolios 10.5 The EG investment strategy 10.6 Investment with side information 10.7 Bibliographic remarks 10.8 Exercises 11 Linear pattern recognition 11.1 Prediction with side information 11.2 Bregman divergences 11.3 Potential-based gradient descent 11.4 The transfer function 11.5 Forecasters using Bregman projections 11.6 Time-varying potentials 11.7 The elliptic potential 11.8 A nonlinear forecaster 11.9 Lower bounds 11.10 Mixture forecasters 11.11 Bibliographic remarks 11.12 Exercises 12 Linear classification 12.1 The zero-one loss 12.2 The hinge loss 12.3 Maximum margin classifiers 12.4 Label efficient classifiers 12.5 Kernel-based classifiers 12.6 Bibliographic remarks 12.7 Exercises 13 Appendix 13.1 Inequalities from probability theory 13.1.1 Hoeffding's inequality 13.1.2 Bernstein's inequality 13.1.3 Hoeffding-Azuma inequality and related results 13.1.4 Khinchine's inequality 13.1.5 Slud's inequality 13.1.6 A simple limit theorem 13.1.7 Proof of Theorem 8.3 13.1.8 Rademacher averages 13.1.9 The Beta distribution 13.2 Basic information theory 13.3 Basics of classification Bibliography Author index