05 Game Playing
www.myreaders.info/ , RC Chakraborty, e-mail [email protected] , June 01, 2010
www.myreaders.info/html/artificial_intelligence.html
Game Playing
Artificial Intelligence
Game Playing, topics : Overview, definition of game, game theory, relevance of ... , ... function, mixed strategies, expected payoff, Mini-Max ... static evaluation, ...
Topics (Lectures 29, 30; 2 hours)
1. Overview
Formalizing game : General and Tic-Tac-Toe game, Evaluation
4. Alpha-Beta Pruning : Alpha-cutoff, Beta-cutoff
5. References
What is Game ?
... words, players determine their own strategies in terms of ...
1. Overview
Game playing, besides the topic of attraction to ... and ... However, for ... practically used.
Applications of game theory are wide-ranging. Von Neumann and Morgenstern
indicated the utility of game theory by linking with economic behavior.
Economic models : For markets of various commodities with differing
numbers of buyers and sellers, fluctuating values of supply and demand,
seasonal and cyclical variations, analysis of conflicts of interest in maximizing
profits and promoting the widest distribution of goods and services.
Social sciences : The n-person game theory has interesting uses in studying
the distribution of power in legislative procedures, problems of majority rule,
individual and group decision making.
Epidemiologists :
After the first move, the new situation determines which player makes the next
move and the alternatives available to that player.
In many board games, the next move is made by the other player.
In many multi-player card games, the player making the next move
depends on who dealt, who took the last trick, who won the last hand, etc.
The moves made by a player may or may not be known to other players.
Games in which all moves of all players are known to everyone are called
games of perfect information.
Most board games are games of perfect information.
Most card games are not games of perfect information.
win = 1 point, draw = 0 points, and loss = -1 points.
by their decisions.
General game theorem : In every two player, zero sum, non-random,
perfect knowledge game, there exists a perfect strategy guaranteed to
at least result in a tie game.
The frequently used terms :
The term "game" means a sort of conflict in which n individuals or
groups (known as players) participate.
A list of "rules" stipulates the conditions under which the game begins.
A game is said to have "perfect information" if all moves are known to
each of the players involved.
A "strategy" is a list of the optimal choices for each player at every
stage of a given game.
A "move" is the way in which game progresses from one stage to
another, beginning with an initial state of the game to the final state.
The total number of moves constitutes the entirety of the game.
The payoff or outcome, refers to what happens at the end of a game.
Minimax - The least good of all good outcomes.
Maximin - The least bad of all bad outcomes.
The primary result of game theory is the Mini-Max Theorem. This theorem says :
"If a Minimax of one player corresponds to a Maximin of the other
player, then that outcome is the best both players can hope for."
relevant
Game Playing
Games can be Deterministic or non-deterministic.
Games can have perfect information or imperfect information.
Games                 | Deterministic     | Non-Deterministic
Perfect information   |                   | Backgammon, Monopoly
Imperfect information | Navigating a maze | Bridge, Poker, Scrabble
Denotes games of strategy. It allows decision-makers (players) ...
Players Strategies  | Player A Strategy 1 | Player A Strategy 2 | Player A Strategy 3 | etc
Player B Strategy 1 | Tie                 | A wins              | B wins              | ...
Player B Strategy 2 | B wins              | Tie                 | A wins              | ...
Player B Strategy 3 | A wins              | B wins              | Tie                 | ...
etc                 | ...                 | ...                 | ...                 | ...
Zero-Sum Game
Constant-Sum Game
Prisoner's Dilemma
It is a two-person nonzero-sum game. It is a noncooperative game
Example : The two players are partners in a crime who have been captured by
the police. Each suspect is placed in a separate cell and offered the
opportunity to confess to the crime.
Now set up the payoff matrix. The entries in the matrix are two numbers
representing the payoff to the first and second player respectively.
Players                | 2nd player Not Confess | 2nd player Confess
1st player Not Confess | 5 , 5                  | 0 , 10
1st player Confess     | 10 , 0                 | 1 , 1
If neither suspect confesses, they go free, and split the proceeds of their
crime, represented by 5 units of payoff for each suspect.
If one prisoner confesses and the other does not, the prisoner who
confesses testifies against the other in exchange for going free and gets the
entire 10 units of payoff, while the prisoner who did not confess goes to
prison and gets nothing.
If both prisoners confess, then both are convicted but given a reduced
term, represented by 1 unit of payoff : it is better than having just the
other prisoner confess, but not so good as going free.
This game represents many important aspects of game theory.
No matter what a suspect believes his partner is going to do, it is always best
to confess.
This conflict between the pursuit of individual goals and the common
good is at the heart of many game theoretic problems.
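The claim that confessing is always the best reply can be checked mechanically. The following is a minimal sketch, assuming the payoff matrix given above; the names `PAYOFF`, `best_reply_for_first` and `dominant_action_for_first` are hypothetical helpers introduced only for this illustration.

```python
# Payoff matrix from the table above.
# PAYOFF[a][b] = (payoff to 1st player, payoff to 2nd player)
# Index 0 = Not Confess, index 1 = Confess.
PAYOFF = [
    [(5, 5), (0, 10)],
    [(10, 0), (1, 1)],
]
ACTIONS = ["Not Confess", "Confess"]


def best_reply_for_first(second_action):
    """Return the 1st player's payoff-maximizing action against a fixed 2nd-player action."""
    return max(range(2), key=lambda a: PAYOFF[a][second_action][0])


def dominant_action_for_first():
    """Return the action that is a best reply to every opponent action, or None."""
    replies = {best_reply_for_first(b) for b in range(2)}
    return replies.pop() if len(replies) == 1 else None


dominant = dominant_action_for_first()
if dominant is not None:
    print("Dominant action for the 1st player:", ACTIONS[dominant])

# Individually rational play (both confess) pays 1 + 1 in total,
# while mutual silence pays 5 + 5: the conflict described above.
both_confess_total = sum(PAYOFF[1][1])
both_silent_total = sum(PAYOFF[0][0])
```

Because the game is symmetric, the same check applies to the 2nd player with the roles exchanged.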
N-Person Game
Mixed Strategies
m-vector, $x = (x_1, \ldots, x_m)$, satisfying
$$x_i \ge 0, \qquad \sum_{i=1}^{m} x_i = 1$$
Now denote the set of all mixed strategies for player-1 by X, and
the set of all mixed strategies for player-2 by Y.
$$X = \{\, x = (x_1, \ldots, x_m) : x_i \ge 0,\ \sum_{i=1}^{m} x_i = 1 \,\}$$
$$Y = \{\, y = (y_1, \ldots, y_n) : y_j \ge 0,\ \sum_{j=1}^{n} y_j = 1 \,\}$$
Expected Payoff
Suppose that player1 and player2 are playing the matrix game A.
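With player1's probabilities $x_1, \ldots, x_m$ labelling the rows and player2's probabilities $y_1, \ldots, y_n$ labelling the columns, the payoff matrix is
$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}$$
That is, cell $(i, j)$ contributes $x_i a_{ij} y_j$, and the expected payoff is
$$A(x, y) = \sum_{i=1}^{m} \sum_{j=1}^{n} x_i a_{ij} y_j$$
or in matrix form
$$A(x, y) = x A y^T$$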
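The double sum for the expected payoff translates directly into code. This is a minimal sketch in plain Python; the function name `expected_payoff` and the 2x2 matrix and probability vectors are illustrative assumptions, not values from the lecture.

```python
def expected_payoff(x, A, y):
    """Return A(x, y) = sum over i, j of x[i] * A[i][j] * y[j].

    x is player1's mixed strategy (one probability per row of A),
    y is player2's mixed strategy (one probability per column of A).
    """
    return sum(
        x[i] * A[i][j] * y[j]
        for i in range(len(x))
        for j in range(len(y))
    )


# Illustrative 2x2 zero-sum game (assumed for this example).
A = [[-1, 1],
     [1, -1]]
x = [0.5, 0.5]    # player1 mixes evenly over the two rows
y = [0.25, 0.75]  # player2 favours the second column

value = expected_payoff(x, A, y)

# A pure strategy is the special case of a mixed strategy with a single 1.
pure_value = expected_payoff([1, 0], A, [0, 1])  # picks entry A[0][1]
```

With an even mix by player1 the positive and negative terms cancel in this game, so player2's choice of y does not change the expected payoff.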
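Maximin Strategy :
Assume that player1 uses x, and player2 chooses y to minimize A(x, y);
player1's expected gain will be
$$V(x) = \min_j \; x A_j$$
where $A_j$ is the j-th column of A. Player1 chooses x to maximize V(x) :
$$V_1 = \max_{x \in X} \min_j \; x A_j$$
Such a strategy x is player1's maximin strategy.
Player2's Minimax Strategy :
If player2 uses y, the most player1 can gain is
$$V(y) = \max_i \; A_i y^T$$
where $A_i$ is the i-th row of A. Player2 chooses y to minimize V(y) :
$$V_2 = \min_{y \in Y} \max_i \; A_i y^T$$
Such a strategy y is player2's minimax strategy.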
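The two guarantee values can be evaluated for any given mixed strategy by checking only the opponent's pure replies. A minimal sketch, assuming a row-player payoff matrix stored as a list of rows; the names `maximin_value` and `minimax_value` and the example matrix are hypothetical.

```python
def maximin_value(x, A):
    """V(x): the least player1 can expect with mixed strategy x,
    i.e. the minimum over columns j of x . A_j."""
    m, n = len(A), len(A[0])
    return min(sum(x[i] * A[i][j] for i in range(m)) for j in range(n))


def minimax_value(y, A):
    """V(y): the most player1 can expect against mixed strategy y,
    i.e. the maximum over rows i of A_i . y."""
    return max(sum(a * p for a, p in zip(row, y)) for row in A)


# Illustrative 2x2 zero-sum game (assumed for this example).
A = [[-1, 1],
     [1, -1]]

v_pure = maximin_value([1, 0], A)         # committing to row 1 can be exploited
v_mixed = maximin_value([0.5, 0.5], A)    # an even mix guarantees more
w_mixed = minimax_value([0.5, 0.5], A)    # player2's even mix caps player1's gain
```

In this game the even mix gives both players the same guarantee, which is the situation the V1 = V2 condition describes. Finding the optimal x and y in general requires an optimization step (for example linear programming) that this sketch does not attempt.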
Players adopt those strategies which will maximize their gains, while
minimizing their losses. Therefore the solution is, the best each ...
V1 = V2
Saddle Point
An element $a_{ij}$ of a matrix is a saddle point if it is both the minimum of its row and the maximum of its column.
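$$A = \begin{pmatrix} 5 & 1 & 3 \\ 3 & 2 & 4 \\ -3 & 0 & 1 \end{pmatrix} \qquad B = \begin{pmatrix} 4 & 3 & 5 \\ 0 & 1 & 0 \\ 6 & 3 & 9 \end{pmatrix} \qquad \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}$$
The third matrix has no saddle point.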
The game payoff matrix B shows that saddle point may not be
unique, but the optimal payoff is unique.
A payoff matrix need not have a saddle point, but if it does, then
the minimax theorem is easily shown to hold true.
The Zero Sum Games modeled in matrices are solved by finding the
saddle point solution.
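Matrix A, with row minima and column maxima :
      5    1    3  | min  1
      3    2    4  | min  2
     -3    0    1  | min -3
Max   5    2    4

Matrix B, with row minima and column maxima :
      4    3    5  | min  3
      0    1    0  | min  0
      6    3    9  | min  3
Max   6    3    9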
1. Find your min payoff : Label each row at its end with its minimum
payoff. This way you'll define your worst case scenarios when
choosing a strategy.
2. Find your opponent's min payoff : Label each column at its bottom
with its maximum payoff. This will show the worst case scenarios for
your opponent.
3. Find out which is the highest value in the series of minimum values.
It is at one place as 2 in matrix A and at two places as 3 in matrix B.
4. Then find out which is the lowest value in the series of maximum
values. It is as 2 in matrix A and as 3 in matrix B.
5. Find out if there is a minimax solution : If these two values match,
then you have found the saddle point cell.
6. If there is a minimax solution, then it is possible that both agents
choose the corresponding strategies. You and your opponent are
maximizing the gain that the worst possible scenario can drive.
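The six steps above can be sketched as a short routine. The matrices below are the ones from the saddle point slide as read row by row, which is an assumption about the original layout; `saddle_points` is a hypothetical helper name.

```python
def saddle_points(M):
    """Return the (row, column) index pairs of all saddle points of M.

    Steps 1-2: label each row with its minimum and each column with its maximum.
    Steps 3-4: take the largest row minimum and the smallest column maximum.
    Step 5:    a saddle point exists only if these two values match.
    """
    row_min = [min(row) for row in M]
    col_max = [max(col) for col in zip(*M)]
    if max(row_min) != min(col_max):
        return []
    return [
        (i, j)
        for i, row in enumerate(M)
        for j, v in enumerate(row)
        if v == row_min[i] == col_max[j]
    ]


A = [[5, 1, 3], [3, 2, 4], [-3, 0, 1]]
B = [[4, 3, 5], [0, 1, 0], [6, 3, 9]]
C = [[-1, 1], [1, -1]]

points_a = saddle_points(A)  # one cell, holding the value 2
points_b = saddle_points(B)  # two cells, both holding the value 3
points_c = saddle_points(C)  # empty: no saddle point
```

Indices are zero-based, so `(1, 1)` denotes the second row and second column.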
[Figure: taxonomy of Game Theory. The root divides into Games of Skill, Games of Chance, and Games of Strategy. Further labels in the tree: Two-persons, Multi-persons; Cooperative, Mixed-motive, Zero-sum; Purely cooperative, Minimal social situation; Perfect info, Imperfect info; Finite, Infinite; Coalitions not permitted, Noncooperative, Cooperative; Essential Coalitions, Non-essential Coalitions; saddle, Non-saddle; Mixed strategy; Symmetric games; Have optimal equilibrium points, Have no optimal equilibrium points.]
Mini-Max Search
Adversary Methods
Required because alternate moves are made by an opponent ,
f(n)
Zero-Sum Assumption
If it is my turn to move, then the node is labeled a MAX node, indicating it is my turn;
otherwise it is labeled a MIN node to indicate it is my opponent's turn.
Arcs represent ... i+1 ...
Mini-Max Algorithm
Since it's my turn to move, the start node is MAX node with
current board configuration.
Expand nodes down (play) to some depth of look-ahead in the
game.
Apply the evaluation function at each of the leaf nodes.
"Back up" values for each non-leaf node until a value is computed for the
root node.
At MIN nodes, the backed up value is the minimum of the values
associated with its children.
At MAX nodes, the backed up value is the maximum of the
values associated with its children.
Note: The process of "backing up" values gives the optimal strategy
for both players, assuming that your opponent is using the
same static evaluation function as you are.
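The backing-up procedure described above can be sketched as a short recursive function. This is a minimal illustration, assuming the tree has already been expanded to the look-ahead depth and that each leaf holds its static evaluation; the nested-list encoding and the name `minimax` are choices made for this example. The leaf values 3, -1, 6, 5, 2, 7 are those that appear in the lecture's example tree, grouped under three MIN nodes as an assumed reading of the figure.

```python
def minimax(node, maximizing):
    """Return the backed-up value of a node.

    A leaf is a number (its static evaluation).
    An internal node is a list of child nodes.
    MAX and MIN levels alternate on the way down.
    """
    if isinstance(node, (int, float)):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


# Root S is a MAX node; its three children are MIN nodes with two leaves each.
tree = [[3, -1], [6, 5], [2, 7]]

child_values = [minimax(child, False) for child in tree]  # MIN-level values
root_value = minimax(tree, True)                          # MAX-level value
best_move = child_values.index(root_value)                # index of the chosen arc
```

The root picks the child with the largest backed-up value, which here is the second child.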
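[Figure: Mini-Max example tree. Root S is a MAX node with backed-up value 5. Its children A, B, C are MIN nodes with backed-up values -1, 5 and 2. Leaf values: 3 and -1 under A; 6 and 5 under B (leaves G, H); 2 and 7 under C (leaves I, J).]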
then it will pick the move associated with the arc from A to F.
Similarly, B's backed-up value is 5 and
C's backed-up value is 2.
+1 for a win, 0 for a draw
Up : One Level
Up : Two Levels
Best move
4. Alpha-Beta Pruning
The problem with Mini-Max algorithm is that the number of game states
Max-player cuts off search when he knows Min-player can force a provably
bad outcome.
Min-player cuts off search when he knows Max-player can force a provably
good (for Max) outcome.
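The two cutoff rules can be sketched by adding a pair of bounds to a recursive Mini-Max search. This is a minimal illustration, not the lecture's own code: the nested-list tree encoding, the name `alphabeta` and the `visited` list (used only to show which leaves were evaluated) are assumptions made for this example.

```python
import math


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Return the Mini-Max value of node, skipping branches that cannot matter.

    alpha: best value Max can already guarantee on the path to this node.
    beta:  best value Min can already guarantee on the path to this node.
    A leaf is a number; an internal node is a list of children.
    """
    if visited is None:
        visited = []
    if isinstance(node, (int, float)):
        visited.append(node)  # record that this leaf was evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # beta cutoff: Min will never allow this branch
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if beta <= alpha:
            break  # alpha cutoff: Max already has a better branch
    return value


tree = [[3, -1], [6, 5], [2, 7]]  # MAX root over three MIN nodes
seen = []
result = alphabeta(tree, True, visited=seen)
```

On this tree the third MIN node is abandoned after its first leaf, because the value 2 is already below the 5 that Max can obtain from the second branch; the result is the same as plain Mini-Max would give.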
4.1 Alpha-Cutoff
It may be found that, in the current branch, the opponent can achieve a
state with a lower value for us than one achievable in another branch. So
the current branch is one that we will certainly not move the game to.
4.2 Beta-Cutoff
It is just the reverse of Alpha-Cutoff.
... achieve a state which has a higher value for us than one the opponent can ...
Search in this ...
5. References : Textbooks
1. "Artificial Intelligence", by Elaine Rich and Kevin Knight, (2006), McGraw Hill
4. "AI: A New Synthesis", by Nils J. Nilsson, (1998), Morgan Kaufmann Inc., Chapter
12, Page 195-213.
An exhaustive list is