Walras-Bowley Lecture 2003 (extended)

Similar documents
Nash Equilibrium and Dynamics

Nash Equilibrium and Dynamics

Equilibrium Selection In Coordination Games

Learning and Equilibrium

Lecture 2: Learning and Equilibrium Extensive-Form Games

MASSACHUSETTS INSTITUTE OF TECHNOLOGY

Behavioral Game Theory

Evolving Strategic Behaviors through Competitive Interaction in the Large

Perfect Bayesian Equilibrium

LEARNING AND EQUILIBRIUM, II: EXTENSIVE-FORM GAMES. Simons Institute Economics and Computation Boot Camp UC Berkeley,August 2015 Drew Fudenberg

Mini-Course in Behavioral Economics Leeat Yariv. Behavioral Economics - Course Outline

Game Theory. Uncertainty about Payoffs: Bayesian Games. Manar Mohaisen Department of EEC Engineering

Testing models with models: The case of game theory. Kevin J.S. Zollman

Epistemic Game Theory

Pareto-optimal Nash equilibrium detection using an evolutionary approach

Lecture 9: The Agent Form of Bayesian Games

Generative Adversarial Networks.

A Framework for Sequential Planning in Multi-Agent Settings

Game theory for playing games: sophistication in a negative-externality experiment

Conformity and stereotyping in social groups

Reinforcement Learning : Theory and Practice - Programming Assignment 1

Tilburg University. Game theory van Damme, Eric. Published in: International Encyclopedia of the Social & Behavioral Sciences

Clicker quiz: Should the cocaine trade be legalized? (either answer will tell us if you are here or not) 1. yes 2. no

The Keynesian Beauty Contest: Economic System, Mind, and Brain

behavioural economics and game theory

Learning, Signaling and Social Preferences in Public Good Games

The Common Priors Assumption: A comment on Bargaining and the Nature of War

Behavioral Game Theory

Behavioral Game Theory

Thoughts on Social Design

CORRELATED EQUILIBRIA, GOOD AND BAD: AN EXPERIMENTAL STUDY. University of Pittsburgh, U.S.A.; University of Aberdeen Business School, U.K.

Games With Incomplete Information: Bayesian Nash Equilibrium

Theoretical Explanations of Treatment Effects in Voluntary Contributions Experiments

Today s lecture. A thought experiment. Topic 3: Social preferences and fairness. Overview readings: Fehr and Fischbacher (2002) Sobel (2005)

Selecting Strategies Using Empirical Game Models: An Experimental Analysis of Meta-Strategies

Game theory and epidemiology

Topic 3: Social preferences and fairness

The Role of Implicit Motives in Strategic Decision-Making: Computational Models of Motivated Learning and the Evolution of Motivated Agents

An Experiment to Evaluate Bayesian Learning of Nash Equilibrium Play

Homo economicus is dead! How do we know how the mind works? How the mind works

SUPPLEMENTARY INFORMATION

How rational are your decisions? Neuroeconomics

Effects of Sequential Context on Judgments and Decisions in the Prisoner s Dilemma Game

What Happens in the Field Stays in the Field: Professionals Do Not Play Minimax in Laboratory Experiments

Perfect Bayesian Equilibrium

A Game Theoretical Approach for Hospital Stockpile in Preparation for Pandemics

On the Evolution of Choice Principles

Introduction to Game Theory Autonomous Agents and MultiAgent Systems 2015/2016

Eliciting Beliefs by Paying in Chance

A short introduction to game theory

Abhimanyu Khan, Ronald Peeters. Cognitive hierarchies in adaptive play RM/12/007

The Game Prisoners Really Play: Preference Elicitation and the Impact of Communication

Operant Matching as a Nash Equilibrium of an Intertemporal Game

Volume 30, Issue 3. Boundary and interior equilibria: what drives convergence in a beauty contest'?

Publicly available solutions for AN INTRODUCTION TO GAME THEORY

Some Thoughts on the Principle of Revealed Preference 1

Morality, policy and the brain

Learning Conditional Behavior in Similar Stag Hunt Games

Cognitive Ability in a Team-Play Beauty Contest Game: An Experiment Proposal THESIS

ULTIMATUM GAME. An Empirical Evidence. Presented By: SHAHID RAZZAQUE

Irrationality in Game Theory

Information Acquisition in the Iterated Prisoner s Dilemma Game: An Eye-Tracking Study

Jakub Steiner The University of Edinburgh. Abstract

Information Design, Bayesian Persuasion, and Bayes Correlated Equilibrium

Uncertainty about Rationality.

Predictive Game Theory* Drew Fudenberg Harvard University

An Economist's Perspective on Probability Matching

Rational Learning and Information Sampling: On the Naivety Assumption in Sampling Explanations of Judgment Biases

Open-minded imitation in vaccination games and heuristic algorithms

Towards a Justification the Principle of Coordination

Two-sided Bandits and the Dating Market

1 Invited Symposium Lecture at the Econometric Society Seventh World Congress, Tokyo, 1995, reprinted from David

Evolutionary game theory and cognition

Ambiguity and the Bayesian Approach

Essays on Behavioral and Experimental Game Theory. Ye Jin. A dissertation submitted in partial satisfaction of the. requirements for the degree of

Behavioral Economics - Syllabus

arxiv: v1 [cs.gt] 14 Mar 2014

Identity, Homophily and In-Group Bias: Experimental Evidence

The Power of Sunspots: An Experimental Analysis

Full terms and conditions of use:

OPTIMIZATION INCENTIVES AND COORDINATION FAILURE IN LABORATORY STAG HUNT GAMES

Working paper n

CONVERGENCE: AN EXPERIMENTAL STUDY OF TEACHING AND LEARNING IN REPEATED GAMES

Published in: Proceedings of the National Academy of Sciences of the United States of America

Signalling, shame and silence in social learning. Arun Chandrasekhar, Benjamin Golub, He Yang Presented by: Helena, Jasmin, Matt and Eszter

Unlike standard economics, BE is (most of the time) not based on rst principles, but on observed behavior of real people.

Inductive Cognitive Models and the Coevolution of Signaling

MTAT Bayesian Networks. Introductory Lecture. Sven Laur University of Tartu

VOLKSWIRTSCHAFTLICHE ABTEILUNG. Reputations and Fairness in Bargaining Experimental Evidence from a Repeated Ultimatum Game with Fixed Opponents

Strong Reciprocity and Human Sociality

Bounded Rationality, Taxation, and Prohibition 1. Suren Basov 2 and Svetlana Danilkina 3

Supporting Information

A Case for Behavioural Game Theory

1 Language. 2 Title. 3 Lecturer. 4 Date and Location. 5 Course Description. Discipline: Methods Course. English

The Foundations of Behavioral. Economic Analysis SANJIT DHAMI

Dancing at Gunpoint: A Review of Herbert Gintis Bounds of Reason

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

Learning and Transfer: An Experimental Investigation into Ultimatum Bargaining

Belief Updating in Sequential Games of Two-Sided Incomplete Information: An Experimental Study of a Crisis Bargaining Model

HOT VS. COLD: SEQUENTIAL RESPONSES AND PREFERENCE STABILITY IN EXPERIMENTAL GAMES * August 1998

Transcription:

Walras-Bowley Lecture 2003 (extended) Sergiu Hart This version: November 2005 SERGIU HART c 2005 p. 1

ADAPTIVE HEURISTICS A Little Rationality Goes a Long Way Sergiu Hart Center for Rationality, Dept. of Economics, Dept. of Mathematics The Hebrew University of Jerusalem http://www.ma.huji.ac.il/hart SERGIU HART c 2005 p. 2

Most of this talk is based on joint work with Andreu Mas-Colell (Universitat Pompeu Fabra, Barcelona) SERGIU HART c 2005 p. 3

Most of this talk is based on joint work with Andreu Mas-Colell (Universitat Pompeu Fabra, Barcelona) All papers and this presentation are available on my home page http://www.ma.huji.ac.il/hart SERGIU HART c 2005 p. 3

Papers http://www.ma.huji.ac.il/hart SERGIU HART c 2005 p. 4

Papers http://www.ma.huji.ac.il/hart Econometrica (2000) Journal of Economic Theory (2001) Economic Essays (2001) Games and Economic Behavior (2003) American Economic Review (2003) Econometrica (2005) SERGIU HART c 2005 p. 4

PART I Introduction: Dynamics SERGIU HART c 2005 p. 5

Dynamics SERGIU HART c 2005 p. 6

Dynamics Learning SERGIU HART c 2005 p. 6

Dynamics Learning START: prior beliefs STEP: observe update (Bayes) optimize (best-reply) REPEAT SERGIU HART c 2005 p. 6

Dynamics Evolution SERGIU HART c 2005 p. 7

Dynamics Evolution populations each individual fixed action ( gene ) frequencies of each action in the population mixed strategy SERGIU HART c 2005 p. 7

Dynamics Evolution populations each individual fixed action ( gene ) frequencies of each action in the population mixed strategy Change: Selection higher payoff higher frequency Mutation random and relatively rare SERGIU HART c 2005 p. 7

Dynamics Adaptive Heuristics SERGIU HART c 2005 p. 8

Dynamics Adaptive Heuristics rules of thumb myopic simple stimulus response, reinforcement behavioral, experiments non-bayesian SERGIU HART c 2005 p. 8

Dynamics Adaptive Heuristics rules of thumb myopic simple stimulus response, reinforcement behavioral, experiments non-bayesian Example: Fictitious Play SERGIU HART c 2005 p. 8

Dynamics Adaptive Heuristics rules of thumb myopic simple stimulus response, reinforcement behavioral, experiments non-bayesian Example: Fictitious Play (Play optimally against the empirical distribution of past play of the other player) SERGIU HART c 2005 p. 8

?. jevolution. jheuristics. Learning? SERGIU HART c 2005 p. 9

Rationality. jevolution. jheuristics. Learning Rationality SERGIU HART c 2005 p. 9

Rationality. jevolution. jheuristics. Learning Rationality most of this talk SERGIU HART c 2005 p. 9

Can simple adaptive heuristics lead to sophisticated rational behavior? SERGIU HART c 2005 p. 10

Game N-person game in strategic (normal) form Players i = 1, 2,..., N SERGIU HART c 2005 p. 11

Game N-person game in strategic (normal) form Players i = 1, 2,..., N For each player i: Actions a i in A i SERGIU HART c 2005 p. 11

Game N-person game in strategic (normal) form Players i = 1, 2,..., N For each player i: Actions a i in A i For each player i: Payoffs (utilities) u i (a) u i (a 1, a 2,..., a N ) SERGIU HART c 2005 p. 11

Dynamics Dynamics Time t = 1, 2,... SERGIU HART c 2005 p. 12

Dynamics Dynamics Time t = 1, 2,... At time t each player i chooses an action a i t in A i SERGIU HART c 2005 p. 12

PART II Regret Matching SERGIU HART c 2005 p. 13

Advertisement DON T YOU FEEL A PANG OF REGRET? 47.15% YIELD 47.15% January May 2003 17.92% 4.43% XYZ Nasdaq Dow Jones Don t wait! Ask your broker today Haaretz June 3, 2003 SERGIU HART c 2005 p. 14

Regret Matching REGRET MATCHING = Switch next period to a different action with a probability that is proportional to the regret for that action SERGIU HART c 2005 p. 15

Regret Matching REGRET MATCHING = Switch next period to a different action with a probability that is proportional to the regret for that action REGRET = increase in payoff had such a change always been made in the past SERGIU HART c 2005 p. 15

Regret U = average payoff up to now SERGIU HART c 2005 p. 16

Regret U = average payoff up to now V (k) = average payoff if action k had been played instead of the current action j every time in the past that j was played SERGIU HART c 2005 p. 16

Regret U = average payoff up to now V (k) = average payoff if action k had been played instead of the current action j every time in the past that j was played R(k) = [V (k) U] + = regret for action k SERGIU HART c 2005 p. 16

Regret U = average payoff up to now V (k) = average payoff if action k had been played instead of the current action j every time in the past that j was played R(k) = [V (k) U] + = regret for action k R(k) R i T (j k) = [ 1 T ) t T : a (u ] it = j i (k, a i t ) u i (a t ) + SERGIU HART c 2005 p. 16

Regret U = average payoff up to now V (k) = average payoff if action k had been played instead of the current action j every time in the past that j was played R(k) = [V (k) U] + = regret for action k R(k) R i T (j k) = [ 1 T ) t T : a (u ] it = j i (k, a i t ) u i (a t ) + SERGIU HART c 2005 p. 16

Regret U = average payoff up to now V (k) = average payoff if action k had been played instead of the current action j every time in the past that j was played R(k) = [V (k) U] + = regret for action k R(k) R i T (j k) = [ 1 T ) t T : a (u ] it = j i (k, a i t ) u i (a t ) + SERGIU HART c 2005 p. 16

Regret U = average payoff up to now V (k) = average payoff if action k had been played instead of the current action j every time in the past that j was played R(k) = [V (k) U] + = regret for action k R(k) R i T (j k) = [ 1 T ) t T : a (u ] it = j i (k, a i t ) u i (a t ) + SERGIU HART c 2005 p. 16

Regret U = average payoff up to now V (k) = average payoff if action k had been played instead of the current action j every time in the past that j was played R(k) = [V (k) U] + = regret for action k R(k) R i T (j k) = [ 1 T ) t T : a (u ] it = j i (k, a i t ) u i (a t ) + SERGIU HART c 2005 p. 16

Regret Matching Next period play: Switch to action k with a probability that is proportional to the regret R(k) (for k j) SERGIU HART c 2005 p. 17

Regret Matching Next period play: Switch to action k with a probability that is proportional to the regret R(k) (for k j) Play the same action j of last period with the remaining probability SERGIU HART c 2005 p. 17

Regret Matching Next period play: Switch to action k with a probability that is proportional to the regret R(k) (for k j) Play the same action j of last period with the remaining probability σ(k) σ i T+1 (k) = cr(k), for k j σ(j) σ i T+1 (j) = 1 k j cr(k) SERGIU HART c 2005 p. 17

Regret Matching Next period play: Switch to action k with a probability that is proportional to the regret R(k) (for k j) Play the same action j of last period with the remaining probability σ(k) σ i T+1 (k) = cr(k), for k j σ(j) σ i T+1 (j) = 1 k j cr(k) c = a fixed positive constant (so that the probability of not switching is > 0) SERGIU HART c 2005 p. 17

Regret Matching Theorem Theorem If all players play Regret Matching then the joint distribution of play converges to the set of CORRELATED EQUILIBRIA of the game SERGIU HART c 2005 p. 18

Joint Distribution of Play Joint distribution of play z T = The relative frequencies that the N-tuples of actions have been played up to time T SERGIU HART c 2005 p. 19

Joint Distribution of Play Joint distribution of play z T = The relative frequencies that the N-tuples of actions have been played up to time T SERGIU HART c 2005 p. 19

Joint Distribution of Play Joint distribution of play z T = The relative frequencies that the N-tuples of actions have been played up to time T T = 1 SERGIU HART c 2005 p. 19

Joint Distribution of Play Joint distribution of play z T = The relative frequencies that the N-tuples of actions have been played up to time T T = 1 0 0 0 1 0 0 z 1 SERGIU HART c 2005 p. 19

Joint Distribution of Play Joint distribution of play z T = The relative frequencies that the N-tuples of actions have been played up to time T T = 2 0 0 1/2 1/2 0 0 z 2 SERGIU HART c 2005 p. 19

Joint Distribution of Play Joint distribution of play z T = The relative frequencies that the N-tuples of actions have been played up to time T T = 3 0 0 2/3 1/3 0 0 z 3 SERGIU HART c 2005 p. 19

Joint Distribution of Play Joint distribution of play z T = The relative frequencies that the N-tuples of actions have been played up to time T T = 10 3/10 0 2/10 1/10 3/10 1/10 z 10 SERGIU HART c 2005 p. 19

Joint Distribution of Play Note 1: The fact that the players randomize independently at each period does not imply that the joint distribution is independent! T = 10 3/10 0 2/10 1/10 3/10 1/10 z 10 SERGIU HART c 2005 p. 19

Joint Distribution of Play Note 1: The fact that the players randomize independently at each period does not imply that the joint distribution is independent! Note 2: Players observe the joint distribution (the history of play) SERGIU HART c 2005 p. 19

Joint Distribution of Play Note 1: The fact that the players randomize independently at each period does not imply that the joint distribution is independent! Note 2: Players observe the joint distribution (the history of play) Note 3: Players react to the joint distribution (patterns, coincidences, communication, signals,...) SERGIU HART c 2005 p. 19

Correlated Equilibrium A Correlated Equilibrium is a Nash equilibrium when the players receive payoff-irrelevant signals before playing the game (Aumann 1974) SERGIU HART c 2005 p. 20

Correlated Equilibrium A Correlated Equilibrium is a Nash equilibrium when the players receive payoff-irrelevant signals before playing the game (Aumann 1974) Examples: SERGIU HART c 2005 p. 20

Correlated Equilibrium A Correlated Equilibrium is a Nash equilibrium when the players receive payoff-irrelevant signals before playing the game (Aumann 1974) Examples: Independent signals SERGIU HART c 2005 p. 20

Correlated Equilibrium A Correlated Equilibrium is a Nash equilibrium when the players receive payoff-irrelevant signals before playing the game (Aumann 1974) Examples: Independent signals Nash equilibrium SERGIU HART c 2005 p. 20

Correlated Equilibrium A Correlated Equilibrium is a Nash equilibrium when the players receive payoff-irrelevant signals before playing the game (Aumann 1974) Examples: Independent signals Nash equilibrium Public signals ( sunspots ) SERGIU HART c 2005 p. 20

Correlated Equilibrium A Correlated Equilibrium is a Nash equilibrium when the players receive payoff-irrelevant signals before playing the game (Aumann 1974) Examples: Independent signals Nash equilibrium Public signals ( sunspots ) convex combinations of Nash equilibria SERGIU HART c 2005 p. 20

Correlated Equilibrium A Correlated Equilibrium is a Nash equilibrium when the players receive payoff-irrelevant signals before playing the game (Aumann 1974) Examples: Independent signals Nash equilibrium Public signals ( sunspots ) convex combinations of Nash equilibria Butterflies play the Chicken Game ( Speckled Wood Pararge aegeria) SERGIU HART c 2005 p. 20

Correlated Equilibria "Chicken" game LEAVE STAY LEAVE 5, 5 3, 6 STAY 6, 3 0, 0 SERGIU HART c 2005 p. 21

Correlated Equilibria "Chicken" game LEAVE STAY LEAVE 5, 5 3, 6 STAY 6, 3 0, 0 a Nash equilibrium SERGIU HART c 2005 p. 21

Correlated Equilibria "Chicken" game LEAVE STAY LEAVE 5, 5 3, 6 STAY 6, 3 0, 0 another Nash equilibriumi SERGIU HART c 2005 p. 21

Correlated Equilibria "Chicken" game LEAVE STAY LEAVE 5, 5 3, 6 STAY 6, 3 0, 0 L 0 1/2 1/2 0 a (publicly) correlated equilibrium SERGIU HART c 2005 p. 21

Correlated Equilibria "Chicken" game LEAVE STAY LEAVE 5, 5 3, 6 STAY 6, 3 0, 0 L S L 1/3 1/3 S 1/3 0 another correlated equilibrium after signal L play LEAVE after signal S play STAY SERGIU HART c 2005 p. 21

Correlated Equilibrium A Correlated Equilibrium is a Nash equilibrium when the players receive payoff-irrelevant signals before playing the game (Aumann 1974) Examples: Independent signals Nash equilibrium Public signals ( sunspots ) convex combinations of Nash equilibria Butterflies play the Chicken Game ( Speckled Wood Pararge aegeria) SERGIU HART c 2005 p. 22

Correlated Equilibrium A Correlated Equilibrium is a Nash equilibrium when the players receive payoff-irrelevant signals before playing the game (Aumann 1974) Examples: Independent signals Nash equilibrium Public signals ( sunspots ) convex combinations of Nash equilibria Butterflies play the Chicken Game ( Speckled Wood Pararge aegeria) Boston Celtics front line SERGIU HART c 2005 p. 22

Correlated Equilibrium Signals (public, correlated) are unavoidable SERGIU HART c 2005 p. 23

Correlated Equilibrium Signals (public, correlated) are unavoidable Common Knowledge of Rationality Correlated Equilibrium (Aumann 1987) SERGIU HART c 2005 p. 23

Correlated Equilibrium Signals (public, correlated) are unavoidable Common Knowledge of Rationality Correlated Equilibrium (Aumann 1987) A joint distribution z is a correlated equilibrium u(j, s i )z(j, s i ) u(k, s i )z(j, s i ) s i s i for all i N and all j, k S i SERGIU HART c 2005 p. 23

Regret Matching Theorem [recall] Theorem If all players play Regret Matching then the joint distribution of play converges to the set of CORRELATED EQUILIBRIA of the game SERGIU HART c 2005 p. 24

Regret Matching Theorem CE = set of correlated equilibria z T = joint distribution of play up to time T distance(z T, CE) 0 as T (a.s.) SERGIU HART c 2005 p. 25

Regret Matching Theorem CE = set of correlated equilibria z T = joint distribution of play up to time T distance(z T, CE) 0 as T (a.s.) z T is approximately a correlated equilibrium (or z T is a correlated approximate equilibrium) from some time on (for all large enough T ) SERGIU HART c 2005 p. 25

Regret Matching Theorem Proof z T is a correlated equilibrium there is no regret: RT i (j k) = 0 for all players and all actions SERGIU HART c 2005 p. 26

Regret Matching Theorem Proof z T is a correlated equilibrium there is no regret: RT i (j k) = 0 for all players and all actions Regret Matching all regrets converge to 0 (Proof: Blackwell Approachability for the vector of regrets + approximate eigenvector probabilities by transition probabilities) SERGIU HART c 2005 p. 26

Regret Matching Theorem Proof z T is a correlated equilibrium there is no regret: RT i (j k) = 0 for all players and all actions Regret Matching all regrets converge to 0 (Proof: Blackwell Approachability for the vector of regrets + approximate eigenvector probabilities by transition probabilities) Note: z T converges to the set CE, not to a point SERGIU HART c 2005 p. 26

Approachability Approachability Setup: Payoffs are m-dimensional vectors (in R m ) Repeated game SERGIU HART c 2005 p. 27

Approachability Approachability Setup: Payoffs are m-dimensional vectors (in R m ) Repeated game Player i Other players ( i) ( ) ( ) ( ) ( ) ( ) ( ) (m = 2) SERGIU HART c 2005 p. 27

Approachability Theorem 1 Let H R m be a half-space. Then: SERGIU HART c 2005 p. 28

Approachability Theorem 1 Let H R m be a half-space. Then: Player i can guarantee that the long-run average payoff converges to H SERGIU HART c 2005 p. 28

Approachability Theorem 1 Let H R m be a half-space. Then: Player i can guarantee that the long-run average payoff converges to H Player i can guarantee that the one-shot payoff lies in H SERGIU HART c 2005 p. 28

Approachability Theorem 1 Let H R m be a half-space. Then: Player i can guarantee that the long-run average payoff converges to H H is APPROACHABLE Player i can guarantee that the one-shot payoff lies in H SERGIU HART c 2005 p. 28

Approachability Theorem 1 Let H R m be a half-space. Then: Player i can guarantee that the long-run average payoff converges to H H is APPROACHABLE Player i can guarantee that the one-shot payoff lies in H H is ENFORCEABLE SERGIU HART c 2005 p. 28

Approachability Theorem 1 Let H R m be a half-space. Then: Player i can guarantee that the long-run average payoff converges to H H is APPROACHABLE Player i can guarantee that the one-shot payoff lies in H H is ENFORCEABLE Proof: Real payoffs (m = 1): Minimax Theorem + Law of Large Numbers SERGIU HART c 2005 p. 28

Approachability Theorem 2 Let C R m be a convex set. Then: SERGIU HART c 2005 p. 29

Approachability Theorem 2 Let C R m be a convex set. Then: Player i can guarantee that the long-run average payoff converges to C SERGIU HART c 2005 p. 29

Approachability Theorem 2 Let C R m be a convex set. Then: Player i can guarantee that the long-run average payoff converges to C C is APPROACHABLE SERGIU HART c 2005 p. 29

Approachability Theorem 2 Let C R m be a convex set. Then: Player i can guarantee that the long-run average payoff converges to C C is APPROACHABLE Every half-space H that contains C is approachable SERGIU HART c 2005 p. 29

Approachability Theorem 2 Let C R m be a convex set. Then: Player i can guarantee that the long-run average payoff converges to C C is APPROACHABLE Every half-space H that contains C is approachable Every half-space H that contains C is enforceable SERGIU HART c 2005 p. 29

Approachability C SERGIU HART c 2005 p. 30

Approachability C H SERGIU HART c 2005 p. 30

Approachability C H SERGIU HART c 2005 p. 30

Approachability C H H SERGIU HART c 2005 p. 30

Approachability C H H SERGIU HART c 2005 p. 30

Approachability H C H H SERGIU HART c 2005 p. 30

Approachability H C H H SERGIU HART c 2005 p. 30

Approachability H H C H H SERGIU HART c 2005 p. 30

Approachability H H C H H SERGIU HART c 2005 p. 30

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability C SERGIU HART c 2005 p. 31

Approachability Note: The intersection of approachable sets need not be an approachable set. SERGIU HART c 2005 p. 32

Approachability Note: The intersection of approachable sets need not be an approachable set. ( 1 0 ) ( 0 1 ) SERGIU HART c 2005 p. 32

Approachability Note: The intersection of approachable sets need not be an approachable set. ( 1 0 ) ( 0 1 ) But: If all half-spaces containing C are approachable, then their intersection (= C) is approachable. SERGIU HART c 2005 p. 32

Remarks Correlating device: the history of play SERGIU HART c 2005 p. 33

Remarks Correlating device: the history of play Other procedures leading to correlated equilibria: Foster Vohra 1997 Calibrated Learning: best-reply to calibrated forecasts Fudenberg Levine 1999 Conditional Smooth Fictitious Play Eigenvector strategy SERGIU HART c 2005 p. 33

Remarks Correlating device: the history of play Other procedures leading to correlated equilibria: Foster Vohra 1997 Calibrated Learning: best-reply to calibrated forecasts Fudenberg Levine 1999 Conditional Smooth Fictitious Play Eigenvector strategy Not heuristics! SERGIU HART c 2005 p. 33

Behavioral Aspects Behavioral aspects of Regret Matching: SERGIU HART c 2005 p. 34

Behavioral Aspects Behavioral aspects of Regret Matching: Commonly used rules of behavior SERGIU HART c 2005 p. 34

Behavioral Aspects Behavioral aspects of Regret Matching: Commonly used rules of behavior Never change a winning team SERGIU HART c 2005 p. 34

Behavioral Aspects Behavioral aspects of Regret Matching: Commonly used rules of behavior Never change a winning team The higher would have been the payoff from another action the higher the tendency to switch to it SERGIU HART c 2005 p. 34

Behavioral Aspects Behavioral aspects of Regret Matching: Commonly used rules of behavior Never change a winning team The higher would have been the payoff from another action the higher the tendency to switch to it Small probability of switching (the status quo bias") SERGIU HART c 2005 p. 34

Behavioral Aspects Behavioral aspects of Regret Matching: Commonly used rules of behavior Never change a winning team The higher would have been the payoff from another action the higher the tendency to switch to it Small probability of switching (the status quo bias") Stimulus-response, reinforcement SERGIU HART c 2005 p. 34

Behavioral Aspects Behavioral aspects of Regret Matching: Commonly used rules of behavior Never change a winning team The higher would have been the payoff from another action the higher the tendency to switch to it Small probability of switching (the status quo bias") Stimulus-response, reinforcement No beliefs (defined directly on actions) No best-reply (better-reply?) SERGIU HART c 2005 p. 34

Behavioral Aspects Similar to models of learning, experimental and behavioral economics: SERGIU HART c 2005 p. 35

Behavioral Aspects Similar to models of learning, experimental and behavioral economics: Bush Mosteller 1955 Erev Roth 1995, 1998 Camerer Ho 1997, 1998, 1999... SERGIU HART c 2005 p. 35

Behavioral Aspects Similar to models of learning, experimental and behavioral economics: Bush Mosteller 1955 Erev Roth 1995, 1998 Camerer Ho 1997, 1998, 1999... SERGIU HART c 2005 p. 35

Behavioral Aspects Similar to models of learning, experimental and behavioral economics: Bush Mosteller 1955 Erev Roth 1995, 1998 Camerer Ho 1997, 1998, 1999... N. Camille et al, The Involvement of the Orbitofrontal Cortex in the Experience of Regret Science May 2004 (304: 1167 1170) SERGIU HART c 2005 p. 35

PART III Generalized Regret Matching SERGIU HART c 2005 p. 36

Questions How special is Regret Matching? SERGIU HART c 2005 p. 37

Questions How special is Regret Matching? Why does conditional smooth fictitious play work? SERGIU HART c 2005 p. 37

Questions How special is Regret Matching? Why does conditional smooth fictitious play work? Any connections? SERGIU HART c 2005 p. 37

Generalized Regret Matching Regret Matching = Switching probabilities are proportional to the regrets: σ(k) = cr(k) SERGIU HART c 2005 p. 38

Generalized Regret Matching Regret Matching = Switching probabilities are proportional to the regrets: σ(k) = cr(k) Generalized Regret Matching = Switching probabilities are a function of the regrets: σ(k) = f(r(k)) SERGIU HART c 2005 p. 38

Generalized Regret Matching Regret Matching = Switching probabilities are proportional to the regrets: σ(k) = cr(k) Generalized Regret Matching = Switching probabilities are a function of the regrets: σ(k) = f(r(k)) f is a sign-preserving function: f(0) = 0, and x > 0 f(x) > 0 f is a Lipschitz continuous function (in fact, much more general: f k,j, potential) SERGIU HART c 2005 p. 38

Generalized Regret Matching Theorem If all players play Generalized Regret Matching then the joint distribution of play converges to the set of correlated equilibria of the game SERGIU HART c 2005 p. 39

Generalized Regret Matching Theorem If all players play Generalized Regret Matching then the joint distribution of play converges to the set of correlated equilibria of the game Proof: Universal approachability strategies + Amotz Cahn, M.Sc. thesis, 2000 SERGIU HART c 2005 p. 39

Special Cases Play probabilities proportional to the m-th power of the regrets (f(x) = cx m, for m 1) SERGIU HART c 2005 p. 40

Special Cases Play probabilities proportional to the m-th power of the regrets (f(x) = cx m, for m 1) m = 1: Regret Matching SERGIU HART c 2005 p. 40

Special Cases Play probabilities proportional to the m-th power of the regrets (f(x) = cx m, for m 1) m = 1: Regret Matching m = : Positive probability only to actions with maximal regret SERGIU HART c 2005 p. 40

Special Cases Play probabilities proportional to the m-th power of the regrets (f(x) = cx m, for m 1) m = 1: Regret Matching m = : Positive probability only to actions with maximal regret Conditional Fictitious Play SERGIU HART c 2005 p. 40

Special Cases Play probabilities proportional to the m-th power of the regrets (f(x) = cx m, for m 1) m = 1: Regret Matching m = : Positive probability only to actions with maximal regret Conditional Fictitious Play But: Not continuous SERGIU HART c 2005 p. 40

Special Cases Play probabilities proportional to the m-th power of the regrets (f(x) = cx m, for m 1) m = 1: Regret Matching m = : Positive probability only to actions with maximal regret Conditional Fictitious Play But: Not continuous Therefore: Smooth Conditional Fictitious Play SERGIU HART c 2005 p. 40

PART IV Unknown Game SERGIU HART c 2005 p. 41

Unknown Game The case of the Unknown game : The player knows only Its own set of actions Its own past actions and payoffs SERGIU HART c 2005 p. 42

Unknown Game The case of the Unknown game : The player knows only Its own set of actions Its own past actions and payoffs The player does not know the game (other players, actions, payoff functions, history of other players actions and payoffs) SERGIU HART c 2005 p. 42

Proxy Regret Unknown game Unknown regret (The player does not know what the payoff would have been if he had played a different action k) SERGIU HART c 2005 p. 43

Proxy Regret Unknown game Unknown regret (The player does not know what the payoff would have been if he had played a different action k) Proxy Regret for k: Use the payoffs received when k has been actually played in the past SERGIU HART c 2005 p. 43

Proxy Regret Unknown game Unknown regret (The player does not know what the payoff would have been if he had played a different action k) Proxy Regret for k: Use the payoffs received when k has been actually played in the past Theorem. If all players play strategies based on proxy regret, then the joint distribution of play converges to the set of correlated equilibria of the game SERGIU HART c 2005 p. 43

PART V Uncoupled Dynamics SERGIU HART c 2005 p. 44

Nash Equilibrium Question: Adaptive heuristics Nash equilibria? SERGIU HART c 2005 p. 45

Nash Equilibrium Question: Adaptive heuristics Nash equilibria? In SPECIAL classes of games: YES SERGIU HART c 2005 p. 45

Nash Equilibrium Question: Adaptive heuristics Nash equilibria? In SPECIAL classes of games: YES Fictitious play, Regret-based,... Two-person zero-sum games Two-person potential games Supermodular games... SERGIU HART c 2005 p. 45

Nash Equilibrium Question: Adaptive heuristics Nash equilibria? In SPECIAL classes of games: YES Fictitious play, Regret-based,... Two-person zero-sum games Two-person potential games Supermodular games... In GENERAL games: NO SERGIU HART c 2005 p. 45

Nash Equilibrium Question: Adaptive heuristics Nash equilibria? In SPECIAL classes of games: YES Fictitious play, Regret-based,... Two-person zero-sum games Two-person potential games Supermodular games... In GENERAL games: NO WHY? SERGIU HART c 2005 p. 45

Uncoupled Dynamics General dynamic for 2-person games: ẋ(t) = F ( x(t), y(t) ; u 1, u 2 ) ẏ(t) = G ( x(t), y(t) ; u 1, u 2 ) x(t) (A 1 ), y(t) (A 2 ) SERGIU HART c 2005 p. 46

Uncoupled Dynamics General dynamic for 2-person games: ẋ(t) = F ( x(t), y(t) ; u 1, u 2 ) ẏ(t) = G ( x(t), y(t) ; u 1, u 2 ) Uncoupled dynamic: ẋ(t) = F ( x(t), y(t) ; u 1 ) ẏ(t) = G ( x(t), y(t) ; u 2 ) x(t) (A 1 ), y(t) (A 2 ) SERGIU HART c 2005 p. 46

Uncoupled Dynamics Adaptive ( rational ) dynamics (best-reply, better-reply, payoff-improving, monotonic, fictitious play, regret-based, replicator dynamics,...) are uncoupled SERGIU HART c 2005 p. 47

Uncoupled Dynamics Adaptive ( rational ) dynamics (best-reply, better-reply, payoff-improving, monotonic, fictitious play, regret-based, replicator dynamics,...) are uncoupled Uncoupledness is a natural informational condition SERGIU HART c 2005 p. 47

Nash-Convergent Dynamics Consider a family of games, each having a unique Nash equilibrium (no coordination problems ) SERGIU HART c 2005 p. 48

Nash-Convergent Dynamics Consider a family of games, each having a unique Nash equilibrium (no coordination problems ) A dynamic is Nash-convergent if it always converges to the unique Nash equilibrium Regularity conditions: The unique Nash equilibrium is a stable rest-point of the dynamic SERGIU HART c 2005 p. 48

Impossibility There exist no uncoupled dynamics which guarantee Nash convergence SERGIU HART c 2005 p. 49

Impossibility There exist no uncoupled dynamics which guarantee Nash convergence There are simple families of games whose unique Nash equilibrium is unstable for every uncoupled dynamic SERGIU HART c 2005 p. 49

Impossibility Adaptive ( rational ) dynamics (best-reply, better-reply, payoff-improving, monotonic, fictitious play, regret-based, replicator dynamics,...) are uncoupled SERGIU HART c 2005 p. 50

Impossibility Adaptive ( rational ) dynamics (best-reply, better-reply, payoff-improving, monotonic, fictitious play, regret-based, replicator dynamics,...) are uncoupled cannot always converge to Nash equilibria SERGIU HART c 2005 p. 50

Nash vs Correlated Correlated equilibria Uncoupled dynamics SERGIU HART c 2005 p. 51

Nash vs Correlated Correlated equilibria Uncoupled dynamics Nash equilibria Coupled dynamics SERGIU HART c 2005 p. 51

Nash vs Correlated Correlated equilibria Uncoupled dynamics Nash equilibria Coupled dynamics Law of Conservation of Coordination There must be coordination either in the equilibrium concept or in the dynamic SERGIU HART c 2005 p. 51

Summary SERGIU HART c 2005 p. 52

Where Do We Go From Here? SERGIU HART c 2005 p. 53

Where Do We Go From Here? Dynamics and equilibria Which equilibria? Which dynamics? SERGIU HART c 2005 p. 53

Where Do We Go From Here? Dynamics and equilibria Which equilibria? Which dynamics? Correlated equilibria: theory and practice Coordination Communication Bounded complexity SERGIU HART c 2005 p. 53

Where Do We Go From Here? Dynamics and equilibria Which equilibria? Which dynamics? Correlated equilibria: theory and practice Coordination Communication Bounded complexity Experiments, empirics Theory SERGIU HART c 2005 p. 53

Where Do We Go From Here? Dynamics and equilibria Which equilibria? Which dynamics? Correlated equilibria: theory and practice Coordination Communication Bounded complexity Experiments, empirics Theory Joint distribution of play (instead of just the marginals) SERGIU HART c 2005 p. 53

Summary SERGIU HART c 2005 p. 54

Summary Therejis a simple adaptive heuristic always leading to correlated equilibria SERGIU HART c 2005 p. 54

Summary Therejis a simple adaptive heuristic always leading to correlated equilibria (Regret Matching) SERGIU HART c 2005 p. 54

Summary Therejis a simple adaptive heuristic always leading to correlated equilibria (Regret Matching) There are many adaptive heuristics always leading to correlated equilibria SERGIU HART c 2005 p. 54

Summary Therejis a simple adaptive heuristic always leading to correlated equilibria (Regret Matching) There are many adaptive heuristics always leading to correlated equilibria (Generalized Regret Matching) SERGIU HART c 2005 p. 54

Summary Therejis a simple adaptive heuristic always leading to correlated equilibria (Regret Matching) There are many adaptive heuristics always leading to correlated equilibria (Generalized Regret Matching) There can be jno adaptive heuristics always leading to Nash equilibria SERGIU HART c 2005 p. 54

Summary Therejis a simple adaptive heuristic always leading to correlated equilibria (Regret Matching) There are many adaptive heuristics always leading to correlated equilibria (Generalized Regret Matching) There can be jno adaptive heuristics always leading to Nash equilibria (Uncoupledness) SERGIU HART c 2005 p. 54

Summary ADAPTIVE HEURISTICS SERGIU HART c 2005 p. 55

Summary BEHAVIORAL ADAPTIVE HEURISTICS SERGIU HART c 2005 p. 55

Summary BEHAVIORAL ADAPTIVE HEURISTICS CORRELATED EQUILIBRIA SERGIU HART c 2005 p. 55

Summary BEHAVIORAL ADAPTIVE HEURISTICS CORRELATED EQUILIBRIA Regret Matching SERGIU HART c 2005 p. 55

Summary BEHAVIORAL ADAPTIVE HEURISTICS CORRELATED EQUILIBRIA Generalized Regret Matching SERGIU HART c 2005 p. 55

Summary BEHAVIORAL ADAPTIVE HEURISTICS CORRELATED EQUILIBRIA Uncoupledness SERGIU HART c 2005 p. 55

Question Can simple adaptive heuristics lead to sophisticated rational behavior? SERGIU HART c 2005 p. 56

Answer Q Can simple adaptive heuristics lead to sophisticated rational behavior? YES! SERGIU HART c 2005 p. 56

Answer Q Can simple adaptive heuristics lead to sophisticated rational behavior? YES! in time... SERGIU HART c 2005 p. 56

Summary Macro BEHAVIORAL RATIONAL SERGIU HART c 2005 p. 57

Summary Macro BEHAVIORAL RATIONAL ADAPTIVE HEURISTICS SERGIU HART c 2005 p. 57

Title ADAPTIVE HEURISTICS (A Little Rationality Goes a Long Way) SERGIU HART c 2005 p. 58

Title ADAPTIVE HEURISTICS (A Little Rationality Goes a Long Way) Rationality Takes Time SERGIU HART c 2005 p. 58

Title ADAPTIVE HEURISTICS (A Little Rationality Goes a Long Way) Rationality Takes Time Regret?... No Regret! SERGIU HART c 2005 p. 58

Title ADAPTIVE HEURISTICS (A Little Rationality Goes a Long Way) Rationality Takes Time Regret?... No Regret! SERGIU HART c 2005 p. 58