Lec 3 - Game Theory and Economics
Module No. # 05
Extensive Games and Nash Equilibrium
Lecture No. # 03
Nash Equilibrium and its Problems
Keywords: Terminal history, outcome, strategic form, Nash equilibrium.
Welcome to the third lecture of the fifth module of the course Game Theory and
Economics. Before we start today's lecture, let me take you through what we have been
discussing in this module.
We have been discussing extensive games, where, unlike in strategic games, the decisions
are taken one after another. When a player is required to take a decision, he or she knows
exactly what decisions were made by the players who moved before him or her. So, this is
the basic idea of extensive games.
We have seen that there are four elements which need to be mentioned when we
describe an extensive game. The first is the set of players, that is, who
are the players involved. The second is the set of terminal histories. A terminal history
is a sequence of actions such that it is not a proper sub history of any other terminal
history. This sequence can be finite or it can be infinite. Thirdly, what needs to be
mentioned is what is known as a player function.
A player function gives the identity of the player who is supposed to move
after a non-terminal history, which is a proper sub history of a terminal history. If I have
a non-terminal history, I need to know who will make a move after that history has
occurred; that is what the player function tells me. A player function is defined
over non-terminal histories. Fourthly, obviously, we need to know how much the players
like or dislike a particular outcome. For example, people take some
decisions and they reach the end of the game. Once the end of the game is
reached, how much does a player like that outcome compared to some other sequence of
actions? These are known as the preferences of the players. And it is easy to see that
these preferences are defined over terminal histories, not over non-terminal histories.
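To make the four elements concrete, here is a minimal sketch that writes down the entry game of this module as plain Python data, using the payoffs quoted later in this lecture. The dictionary layout and the helper name is_proper_subhistory are assumptions made for illustration, not notation from the course.

```python
# The four elements of an extensive game, written out for the entry game.
# The data layout is an illustrative assumption, not the course's notation.

# 1. Set of players.
players = ["challenger", "incumbent"]

# 2 and 4. Terminal histories (tuples of actions) with the payoffs
# (challenger, incumbent) that represent preferences over them.
payoffs = {
    ("Out",): (1, 2),
    ("In", "Fight"): (0, 0),
    ("In", "Accommodate"): (2, 1),
}
terminal_histories = set(payoffs)

# 3. Player function: defined only on non-terminal histories.
player_function = {
    (): "challenger",       # empty history: the challenger moves first
    ("In",): "incumbent",   # after In, the incumbent moves
}

def is_proper_subhistory(h, t):
    """True if h is an initial segment of t and h is not t itself."""
    return h != t and t[:len(h)] == h

# Every history on which the player function is defined is a proper
# sub history of some terminal history.
nonterminal_ok = all(
    any(is_proper_subhistory(h, t) for t in terminal_histories)
    for h in player_function
)
# No terminal history is a proper sub history of another terminal history.
terminal_ok = not any(
    is_proper_subhistory(a, b)
    for a in terminal_histories
    for b in terminal_histories
)
print(nonterminal_ok, terminal_ok)  # True True
```

Notice that preferences are attached only to terminal histories, while the player function is attached only to non-terminal ones.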
This is the basic setting. Now, in the case of a sequential game or extensive game, if we
are to have an idea of a solution, as to which outcome will occur in a game theoretic
situation, then we need certain conceptual tools. In the last class,
we started with one such tool, which we called a strategy.
A strategy is defined for a particular player. Suppose the player
is i; i could be any one of 1, 2, ..., n. Now, what does a strategy of player i tell us?
A strategy of player i should tell me what her action will be after
every non-terminal history h after which it is i's turn to move, that is, to pick an
action. Suppose I have h, a non-terminal history, and I know that P(h) is equal to i, where
P is the player function. If I apply this player function to h, I get that P(h) is equal to i.
Suppose that A(h) is the set of actions which are available after this non-terminal history h;
then the strategy of i should tell me which action i takes from this action
set A(h). So, a strategy of a player should tell me what action that player is
going to take after every non-terminal history h after which it is i's turn to make a
move. Here, I am taking one h, but there could be more than one h; there could be many
histories after which i is supposed to make a move.
A particular strategy of i should specify all the actions that he will take whenever he is
asked to make a move. Suppose there are five sub histories after which i is supposed to
make a move; then a strategy of i, which we shall denote by s_i, will consist of 5 elements,
each element specifying what player i is going to do after that particular sub history.
A particular player could have many strategies. If we make a collection of all the
strategies of a particular player, then that collection or set
will be denoted by capital S_i, and it will be called the strategy set of player i. This is a
conceptual tool we are developing to find out how we can use the idea of equilibrium, or
of any solution, in the case of an extensive game.
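As a rough sketch of how a strategy set is built, the code below enumerates all strategies of a player who moves after two histories. It uses player 2 from the exercise game solved later in this lecture, who picks E or F after C and G or H after D. The function name strategy_set and the dictionary representation are assumptions made for illustration.

```python
from itertools import product

# Histories after which player 2 moves, with the action set A(h) at each.
actions_after = {
    ("C",): ["E", "F"],
    ("D",): ["G", "H"],
}

def strategy_set(actions_after):
    """Return every strategy as a dict mapping history -> chosen action.

    A strategy picks one action after EVERY history at which the player
    moves, so the strategy set is the product of the action sets.
    """
    histories = list(actions_after)
    choices = product(*(actions_after[h] for h in histories))
    return [dict(zip(histories, choice)) for choice in choices]

S2 = strategy_set(actions_after)
labels = ["".join(s[h] for h in actions_after) for s in S2]
print(labels)  # ['EG', 'EH', 'FG', 'FH']
```

With two histories and two actions at each, the player has 2 x 2 = 4 strategies, each with one element per history.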
(Refer Slide Time: 08:37)
Now, we have defined the strategy set of a particular player, capital S_i. Similarly,
there is a strategy set for each of the players; i is just one player. Each of
the strategy sets contains many strategies. From here we arrive at
another concept, which is known as a strategy profile. A strategy profile, which we denote
by small s, picks one strategy for each player. A strategy profile traces out exactly one
terminal history, and we denote that terminal history by O(s).
So, O is what is known as the outcome, or we can also call it the outcome function.
The outcome function is defined over strategy profiles small s. If I have a terminal history,
which is also called an outcome, then I also know how much the players like or
dislike that terminal history, because remember that preferences are defined over terminal
histories. So, I can talk about u_i(O(s)); this gives me a number which represents the
preference of player i with respect to the outcome generated by this
strategy profile. We shall assume that we have ordinal preferences, which is why
I have written small u. So, this is the setting.
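The outcome function can be sketched in a few lines: start from the empty history and keep appending the action that the mover's strategy prescribes until a terminal history is reached. The example uses the entry game with the payoffs quoted in this lecture; the names outcome and profile and the dictionary layout are illustrative assumptions.

```python
# Entry game: payoffs are (challenger, incumbent) at each terminal history.
payoffs = {
    ("Out",): (1, 2),
    ("In", "Fight"): (0, 0),
    ("In", "Accommodate"): (2, 1),
}
# Player function P, defined on non-terminal histories only.
player_function = {(): "challenger", ("In",): "incumbent"}

def outcome(profile):
    """O(s): the terminal history generated by the strategy profile s."""
    h = ()
    while h in player_function:            # h is still non-terminal
        mover = player_function[h]         # P(h)
        h = h + (profile[mover][h],)       # action prescribed after h
    return h

# Strategy profile (Out, Fight): each strategy maps histories to actions.
s = {
    "challenger": {(): "Out"},
    "incumbent": {("In",): "Fight"},
}
O_s = outcome(s)
print(O_s, payoffs[O_s])  # ('Out',) (1, 2)
```

The incumbent's strategy says Fight, yet Fight does not appear in O(s), because the history In is never reached under this profile.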
Now, if this is the setting, then we have a way to define Nash equilibrium
in this sort of extensive game by directly using the concept of Nash equilibrium that we
developed for strategic games. What is happening here is that instead
of the actions which we had in strategic games, we have strategies, and like action profiles,
we have strategy profiles. From a strategy profile we get an outcome, and from the
outcome we can read off the preferences of the players, represented by each player's
payoff function u_i.
One can very easily and directly use the idea of Nash equilibrium that we developed
for strategic games. Let me define it. A strategy
profile s* in an extensive game with perfect information is a Nash equilibrium
if, for every player i and every strategy s_i of player i, the terminal history O(s*)
generated by s* is at least as good according to i's preferences as the terminal history
O(s_i, s*_{-i}) generated by the strategy profile (s_i, s*_{-i}), in which player i chooses
s_i while every other player j chooses s*_j.
(Refer Slide Time: 17:19)
So, this is the definition in terms of language. Symbolically, we can write it as
follows: for every player i, it must be the case that

$$u_i(O(s^*)) \geq u_i(O(s_i, s^*_{-i})) \quad \text{for every strategy } s_i \text{ of player } i.$$

So, this is the definition in
terms of symbols. Here u_i is the payoff function of player i, which we know is defined
over the outcome or the terminal history. We call s* a Nash equilibrium
strategy profile if player i's payoff from the outcome generated by s*
is at least as large as the payoff to player i if he plays any other strategy s_i while the
other players, that is not i, stick to their equilibrium strategies, that is, the other
players play s*_{-i}. If the other players stick to their star strategies or equilibrium
strategies and any one player deviates, then the payoff of that player cannot be more; it
can be either equal or less.
This is the idea of Nash equilibrium that we have in extensive games. The
similarity of this definition with the definition for strategic games is very clear. There we
did not have outcomes or terminal histories. Instead of
strategies what we had were actions, so every player had an action. An action profile is a
Nash equilibrium if, for each player, deviating from his or her equilibrium action
either reduces the payoff or keeps it as it is; it can never lead to a rise in the payoff.
If that is true, then the action profile is called a Nash equilibrium action
profile.
(Refer Slide Time: 21:15)
Here, the difference is that in place of actions we have strategies, and in place of
an action profile we have a strategy profile. A strategy profile generates an
outcome, from that outcome we get the payoffs, and then we compare the payoffs.
So, this is how Nash equilibrium is defined in extensive games.
Now, when we try to apply this idea to find the
Nash equilibria of an extensive game, how do we proceed? We
transform the extensive game into what is known as its strategic form. Though
the game is an extensive game, which means that the decisions are taken one after
another, we can look at the game as if it were a strategic game. In a strategic
game we have to specify actions, so which are the actions in these extensive games?
We define the strategies of each player as the actions of that player. Correspondingly, we
get a strategy profile, which is the equivalent of an action profile. We know that
from a strategy profile we can get the payoffs, just as we get payoffs from an action
profile in a strategic game.
The transformation is done in the following way. The players in the strategic form are
the players in the extensive game. Remember, we are not defining an extensive game
anymore; we are defining a strategic game, but one which has been
obtained from an extensive game. If I have to define a strategic game,
first I have to have the set of players. The set of players remains the same, which means
that the set of players I had in the extensive game becomes the set of players in the
strategic game.
Actions: each player's set of actions is her set of strategies in the extensive game.
Finally, I have to mention the preferences of the players. Preferences are
the following: each player's payoff to each action profile is her payoff to the terminal
history generated by that action profile in the extensive game. We know that the
strategies of the players are the same as the actions, so a strategy profile is an action
profile; from the action profile I get a terminal history, and from the terminal history I
know what the payoffs of the players are.
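The transformation just described can be sketched as follows for the entry game: the players stay the same, the strategies become the actions, and each action profile gets the payoff of the terminal history it generates. The payoffs are the ones used in this lecture; the helper terminal_history is a shortcut that is valid only because each player in this game moves at most once, and all names are assumptions for illustration.

```python
from itertools import product

# Entry game: payoffs are (challenger, incumbent) at each terminal history.
payoffs = {
    ("Out",): (1, 2),
    ("In", "Fight"): (0, 0),
    ("In", "Accommodate"): (2, 1),
}

# Each player moves after exactly one history, so a strategy is one action.
challenger_strategies = ["In", "Out"]
incumbent_strategies = ["Fight", "Accommodate"]

def terminal_history(c, i):
    """Terminal history generated by the strategy profile (c, i)."""
    return ("Out",) if c == "Out" else ("In", i)

# Strategic form: action profile -> payoffs of the generated terminal history.
strategic_form = {
    (c, i): payoffs[terminal_history(c, i)]
    for c, i in product(challenger_strategies, incumbent_strategies)
}
for profile, u in strategic_form.items():
    print(profile, u)
# ('In', 'Fight') (0, 0)
# ('In', 'Accommodate') (2, 1)
# ('Out', 'Fight') (1, 2)
# ('Out', 'Accommodate') (1, 2)
```

Both profiles in which the challenger stays out generate the same terminal history, so they carry the same payoffs.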
So that is how, from an extensive game, I can define the strategic form of that extensive
game. Once it is defined in this way, I have in effect a strategic
game, and from that strategic game I can find the Nash equilibria. Let me give you an
example. Let us take the entry game once again and find the Nash equilibria of this
game. Remember what this game looks like. This is the challenger; either he gets in or he
keeps out. If he gets in, then the incumbent can make a move, and the incumbent can make
two sorts of moves: either the incumbent fights or the incumbent accommodates. With that,
we reach the end of the terminal histories.
If out is the terminal history, then the payoffs are (1, 2): the
challenger gets 1 and the incumbent gets 2. If in, fight is the terminal history, the
payoffs are (0, 0), and if in, accommodate is the terminal history, then we have (2, 1).
So, this was the game.
What are the strategies of these players? The first
player, that is the challenger, has two strategies, in or out. The strategy set of the
incumbent is fight or accommodate. Now, I know that in the strategic form of this
extensive game the actions are the strategies, so each action set is now
the strategy set. If I know these two action sets, then I can find the Nash equilibria by
looking at the game in what is known as the normal form.
Here, I have the challenger and here is the incumbent (Refer Slide Time: 29:19). Now,
let us write down the payoffs to the players. If in, fight is the terminal history, it is
(0, 0); in, accommodate gives (2, 1). For out and fight, the payoffs are (1, 2),
because the challenger is choosing out and the game ends there; for the same reason, out
and accommodate also gives (1, 2). So, this is known as the
normal form representation of the strategic form that we have
derived from the extensive game. By looking at it, which are the Nash equilibria? We
can see that one Nash equilibrium is in and accommodate, and the other Nash
equilibrium is out and fight.
The other action profiles, that is, in and fight, and out and accommodate, are not Nash
equilibria. The Nash equilibria of this game are in and accommodate, and out and fight. Now,
this was what we got from the matrix, but how does it translate to the game
tree? Take in and accommodate, so this and
this (Refer Slide Time: 31:40). Why is this a Nash equilibrium? Because from the
challenger's point of view he is getting 2 in this terminal history; if he changes his
strategy, that is, if he chooses out in place of in, then he gets 1, and 1 is less than 2.
So from the challenger's point of view this is optimal. From the incumbent's point of view,
the incumbent is getting 1. Now, what else could he do?
(Refer Slide Time: 26:09)
The incumbent could say that I will fight instead of accommodate. But if he fights, given
the challenger's action, that is in, the incumbent gets 0, and 0 is less than 1. So it is
optimal for the incumbent to choose accommodate. Therefore, from the point of view of
both the players, in and accommodate is an equilibrium; it is optimal. What about out and
fight? Here, what is the payoff to the
challenger? He is getting 1, because we are at the terminal history out.
So, the challenger is getting 1 and the incumbent is getting 2. Can the challenger be better
off? Well, if the challenger changes his strategy and chooses in, then given that the
incumbent is saying I am going to fight you, the challenger will get 0, and 0 is less than 1.
From the challenger's point of view, given that the incumbent is going to fight him, it is
better to stay out. Is this optimal from the incumbent's point of view? Well,
yes, because the challenger is choosing out, so the incumbent is getting 2. Now, it does not
really matter whether the incumbent chooses fight or accommodate, because the outcome
remains out. It does not make any difference to the payoff that he is getting;
anyway he is getting the maximum payoff that is possible for him, which is 2. So in and
accommodate is an equilibrium and out and fight is also an equilibrium; that is why both of
them are Nash equilibria
(Refer Slide Time: 34:34).
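The deviation checks we just did by hand can be sketched as a small routine: a profile is a Nash equilibrium if each player's payoff equals the best payoff he could get by changing only his own strategy. The payoff table is the strategic form of the entry game from this lecture; the function name is_nash and the data layout are assumptions for illustration.

```python
from itertools import product

# Strategic form of the entry game: (challenger, incumbent) payoffs.
strategic_form = {
    ("In", "Fight"): (0, 0),
    ("In", "Accommodate"): (2, 1),
    ("Out", "Fight"): (1, 2),
    ("Out", "Accommodate"): (1, 2),
}
# Strategy sets: index 0 is the challenger, index 1 is the incumbent.
S = [["In", "Out"], ["Fight", "Accommodate"]]

def is_nash(profile):
    """True if no player can gain by deviating unilaterally."""
    for i, S_i in enumerate(S):
        def payoff_if(s_i):
            # Player i plays s_i, everyone else sticks to the profile.
            deviation = list(profile)
            deviation[i] = s_i
            return strategic_form[tuple(deviation)][i]
        if payoff_if(profile[i]) != max(payoff_if(s_i) for s_i in S_i):
            return False
    return True

equilibria = [p for p in product(*S) if is_nash(p)]
print(equilibria)  # [('In', 'Accommodate'), ('Out', 'Fight')]
```

In the profile out and fight, the incumbent's deviation to accommodate leaves his payoff at 2, which is why the check uses "at least as good" rather than "strictly better".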
Now, this is all well and good, but remember that the game we have here is an
extensive game, where the decisions are taken one after another. The idea that we are
trying to apply is the idea of Nash equilibrium, and what is a Nash equilibrium? It is a
steady state.
Given the actions taken by the other players, I form beliefs regarding their actions in the
future, and with respect to those beliefs I take my optimal action; this applies to each
and every player. That is why we call it a steady state. But since the game is an
extensive game, where the decisions come stage by stage in a sequential manner, there
is a problem with the equilibrium out and fight.
Why is this a problem? Because the terminal history we are getting here is out;
the game is ending here (Refer Slide Time: 35:35). Now, in Nash equilibrium, the
concept was that you observe the action of the other players and then you form a belief
that this is the action that he is going to take. But if the game terminates here, how
can the challenger know that the incumbent is going to fight? In this equilibrium I
am saying that the incumbent is going to fight, whereas the equilibrium terminal history
of the game ends before the incumbent ever moves.
So we are saying that there is a problem of interpretation with out and fight
if we apply the idea of Nash equilibrium the way we have been interpreting it so far.
The terminal history it traces out is just out, which means that the game is
ending here, whereas the equilibrium strategy profile is telling me that the incumbent is
going to fight, which the challenger never gets to see.
If he never gets to see an action, how does he form the belief that the incumbent is going
to fight? For example, how does the incumbent convince the challenger that I am going
to fight you, when the challenger never sees the actions of the incumbent? We need
some way to justify this kind of behavior, which lies off the terminal history
induced by the strategy profile. The terminal history induced by this strategy
profile is out, so how do we justify the action fight, which is not on it?
One way to justify it is this: the equilibrium strategy profile is out and
fight, but it may happen that the challenger sometimes experiments, or he may
commit errors. If he experiments or commits errors, it means
that he may sometimes choose in, though out is his equilibrium strategy, and when he
chooses in he observes that the incumbent fights him.
He is then convinced about the action of the other player through these experiments.
If the challenger sometimes experiments and chooses in, then he observes
the incumbent fighting; that is why he forms the belief that the incumbent will fight
him if he chooses in. Remember what fight means here: fight means
that the incumbent is saying that if the history is in, then I will fight. Now, the challenger
is doing some experiments or making errors and choosing in, and he is observing that the
incumbent fights him; therefore out and fight could be justified.
(Refer Slide Time: 37:16)
Now, the problem again is that if the challenger gets in, then for the incumbent to
choose fight is suboptimal. Remember, if the sub history is in, then the
incumbent has to choose between fight and accommodate, which give him 0 and 1, and 1 is
greater than 0. So, it
is optimal for the incumbent not to fight; it is optimal for him to accommodate.
Therefore, this sort of strategy profile that we are getting through Nash equilibrium is
what we call not robust. What we mean by not robust is this: if the challenger
deviates from the equilibrium strategy, then the strategy which is mentioned in the
equilibrium strategy profile is not in fact going to be followed by the other
players. For the incumbent in this case fighting is not optimal, though it is
part of the Nash equilibrium strategy profile.
So that is why we are saying that a Nash equilibrium strategy profile in an extensive game
may not be robust. But see, in the case of in and accommodate there is no such problem. In
in and accommodate, the challenger gets in and the incumbent accommodates him. There
is no problem of unobserved actions here, because all the actions mentioned in the
strategy profile are on the terminal
history we are tracing out. But in the previous case, the strategy profile was out and
fight, whereas the terminal history was just out, so fight was not observed. So, there is a
problem of interpretation, in the sense that in an extensive form game people take
decisions stage by stage. Now, suppose instead that the players could commit to their
actions at the beginning of the game itself. For example, suppose the incumbent told the
challenger at the beginning of the game that I will fight you if you get in; that is my
strategy. If that had been a binding commitment, then there would not have been any
problem in this case. Then obviously the challenger is going to stay out, because he
knows that if he gets in, the incumbent is bound to fight. Given
that, it is better to stay out. But in an extensive game things move stage by
stage, and there is no binding agreement. Here, if the challenger in fact gets in, then,
truly speaking, it is not optimal for the incumbent to fight him; once
entry has happened, it is suboptimal for the incumbent to fight.
Therefore, this kind of strategy, that after in I will fight, seems to be an empty threat;
it is a non-credible strategy. We call it a non-credible threat that the incumbent is posing
to the challenger: look, you had better stay out, because if you get in, I will fight
you. It is a threat, but it is non-credible, because going by the theory of rational choice,
if the challenger indeed gets in, the incumbent is not going to fight; he is going to
accommodate.
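A small sketch of the credibility check, under the lecture's payoffs for the entry game: we look only at the incumbent's choice once the history in has actually occurred and ask which action is optimal there. The names options, best_reply and threat_is_credible are assumptions made for illustration.

```python
# Entry game: payoffs are (challenger, incumbent) at each terminal history.
payoffs = {
    ("Out",): (1, 2),
    ("In", "Fight"): (0, 0),
    ("In", "Accommodate"): (2, 1),
}

# Incumbent's payoff from each action once the history In has occurred.
options = {a: payoffs[("In", a)][1] for a in ["Fight", "Accommodate"]}
best_reply = max(options, key=options.get)

# The threat "after In I will fight" is credible only if Fight is optimal
# for the incumbent at the point where he would have to carry it out.
threat_is_credible = (best_reply == "Fight")

print(options)             # {'Fight': 0, 'Accommodate': 1}
print(best_reply)          # Accommodate
print(threat_is_credible)  # False
```

Fight is part of the Nash equilibrium out and fight, but it is not what the incumbent would choose if entry really took place, and this is the sense in which the threat is empty.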
Given the fact that people cannot commit at the beginning of the game to what
they will do in the subsequent stages, this equilibrium is ruled out, or, as we say, it is
not robust. This is an equilibrium in the Nash equilibrium
sense: given that the other player is saying I am going to fight, the challenger is not
looking into the structure of the game; he is just saying, you are going to fight, so let me
stay out, because that is optimal for me. But if he thinks more carefully and looks
into the structure, he sees that if he gets in, it is suboptimal for the incumbent to fight.
Then he will figure out that the threat is not credible; that is why we are saying it is a
non-credible threat.
We need a better notion of equilibrium, one which rules out this
sort of Nash equilibrium. This is a Nash equilibrium, there is no doubt about it,
but we want to rule it out, and ruling it out becomes necessary
because the game we have now is a sequential game. The decisions are taken one
after another, so people might renege: they might say something at the beginning and
change their decisions after some stages have been played. Since there is this
possibility of reneging, we need a better notion of equilibrium,
one which is a refinement of the notion of Nash equilibrium.
Before we do that, let us do some exercises to see how Nash equilibria are found.
Let me take this game. There are two players here, player 1 and player 2. Player 1 has two
actions in stage 1, C and D. Then player 2 takes an action: player 2's actions are E and F
after C, and G and H after D, and
then the game ends. So, the terminal histories are C E, C F, D G and D H. Player 1
moves first, after the empty history phi; player 2 moves after two
sub histories, C and D.
What about the payoff functions? C F is best for player 1, so let us give it 3;
second best is D G with 2, third best is C E with 1, and fourth is D H with 0. For player 2,
D G is best, then C F, then D H and then C E. Suppose this is the game that is given, and
I have to find the Nash equilibria of this
game. To do that, first I have to find the sets of strategies. For player 1 the set is C
and D; for player 2, the set of strategies is EG, EH, FG, FH.
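The strategic form of this game can be tabulated with a short sketch. Player 1's payoffs 3, 2, 1, 0 are the numbers given above. For player 2 only the ranking D G, C F, D H, C E is stated, so the numbers 3, 2, 1, 0 used for player 2 below are an assumed ordinal representation of that ranking. The function names u and is_nash are also assumptions for illustration.

```python
from itertools import product

# Payoffs (player 1, player 2) at each terminal history.
# Player 2's numbers are an ASSUMED ordinal scale for the stated ranking.
payoffs = {
    ("C", "E"): (1, 0),
    ("C", "F"): (3, 2),
    ("D", "G"): (2, 3),
    ("D", "H"): (0, 1),
}

S1 = ["C", "D"]
# First letter: player 2's action after C. Second letter: action after D.
S2 = ["EG", "EH", "FG", "FH"]

def u(s1, s2):
    """Payoffs of the terminal history generated by the profile (s1, s2)."""
    reply = s2[0] if s1 == "C" else s2[1]
    return payoffs[(s1, reply)]

def is_nash(s1, s2):
    """True if neither player can gain by a unilateral deviation."""
    best1 = u(s1, s2)[0] == max(u(d, s2)[0] for d in S1)
    best2 = u(s1, s2)[1] == max(u(s1, d)[1] for d in S2)
    return best1 and best2

# One row of the payoff matrix per strategy of player 1.
for s1 in S1:
    print(s1, [u(s1, s2) for s2 in S2])
# C [(1, 0), (1, 0), (3, 2), (3, 2)]
# D [(2, 3), (0, 1), (2, 3), (0, 1)]

equilibria = [(s1, s2) for s1, s2 in product(S1, S2) if is_nash(s1, s2)]
print(equilibria)  # [('C', 'FG'), ('C', 'FH'), ('D', 'EG')]
```

Only the ordering of player 2's payoffs matters for this check, so any other numbers that respect the same ranking would give the same three profiles.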
What about the other Nash equilibria? Look at C FG.
C FG is a Nash equilibrium, because if player 1 deviates and plays D, he is going to get 2,
which is less than 3. If player 2 deviates to a strategy that plays E after C, he is going
to get 0, which is less than 2, and deviating to FH leaves the outcome unchanged. So that
is a Nash equilibrium.
We can see that C FH is also a Nash equilibrium. Here, if player 1 deviates to D, he is
going to get 0, which is less than 3; if player 2 deviates to a strategy that plays E after
C, he is going to get 0, which is less than 2. So these are
the Nash equilibria of this game. Let me write them down: C FG, C FH and D EG. So,
this is where we shall stop today. What we have done in this class is that we have
developed the idea of Nash equilibrium for extensive games, and we have solved
two problems, in fact, to see how to apply this idea. We have started out with the
idea that one has to refine the Nash equilibrium concept in extensive games, because it
can give us non-robust equilibria which have a problem of interpretation. We shall
develop that idea in the next lecture. Thank you.
(Refer Slide Time: 57:18)
Find the Nash equilibria in the following game. The actions of player 1 after the empty
history are A and B, so this is A and this is B; the actions of player 2 are C and D; and
the actions of player 1 after the history A C has occurred
are E and F, so this is E and this is F.
Now, how do we find the Nash equilibria of an extensive game with perfect information?
First, we find the strategic form of this game. The players in the strategic
form are, like before, 1 and 2. What is the set of actions for 1? The
action set is now the set of strategies. Let me draw the
game tree: this is A, B, this is C, D, and this is E, F. What are the strategies of player 1?
They are AE, AF, BE and BF, and the set of actions of player 2 is C and D. The preferences
over action profiles are the same as the preferences over the strategy profiles of the
extensive game.
Now, we can draw the payoff matrix and find the Nash equilibria. This is
the payoff matrix. We can see that there are four Nash equilibria here: AE D,
AF D, BE C and BF C; this one, this one, this one and this one (Refer Slide Time:
01:01:03).
Explain how the concept of Nash equilibrium may not give us a robust steady state in
extensive games with perfect information.
To show that in an extensive game with perfect information Nash equilibrium may not
give us a robust steady state, let us take the game in
exercise one and look at a particular Nash equilibrium, say BF C.
Now, this is a Nash equilibrium. Given that player 1 is playing B,
player 1 is getting 2 and player 2 is getting 0. Player 2 can choose either C or D; it
does not matter, because he is anyway getting 0 when player 1 takes the action B.
Now, this is not a robust steady state. Suppose player 1, instead of choosing B,
chooses A. Then for 2, C is suboptimal; it is better for him
to choose D, because by choosing D he is going to get 2, and by choosing C he is going to
get 0. So, first, player 2's action is not optimal once we move off the path of this Nash
equilibrium. Secondly, player 2 is
saying that I am going to play C, but this action of player 2 cannot be confirmed by
player 1, because that action never gets played, since player 1 has already chosen B.
In fact, if player 1 experiments and chooses A, player 2 is never going to
choose C. Therefore this action of player 2, which is C, cannot be sustained if there is
some deviation from the action B. That is why we say it is not a robust steady state.
Thank you.