10.5.1 Game Trees
In most literature, sequential games are formulated in terms of game trees. A state-space representation, which is more in alignment
with the representations used in this chapter, will be presented in
Section 10.5.2. The tree representation is commonly referred
to as the extensive form of a game
(as opposed to the normal form, which is
the cost matrix representation used in Chapter 9).
The representation is helpful for visualizing many issues in game
theory. It is perhaps most helpful for visualizing information
states; this aspect of game trees will be deferred until Section
11.7, after information spaces have been formally
introduced. Here, game trees are presented for cases that are simple
to describe without going deeply into information spaces.
Figure 10.12: A $3 \times 3$ matrix game expressed using a game tree.
Before a sequential game is introduced, consider representing a single-stage game in a tree form. Recall Example 9.14, which is a zero-sum, $3 \times 3$ matrix game. It can be represented as a game tree as shown in Figure 10.12. At the root, $P_1$ has three choices. At the next level, $P_2$ has three choices. Based on the choices by both, one of nine possible leaves will be reached. At this point, a cost is obtained, which is written under the leaf. The entries of the cost matrix, (9.53), appear across the leaves of the tree. Every nonleaf vertex is called a decision vertex: One player must select an action.
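To make the tree structure concrete, here is a minimal Python sketch that expands a cost matrix into its extensive form; the matrix entries are placeholders, not the actual values of (9.53):

```python
# Minimal sketch: a 3x3 matrix game in extensive (tree) form.
# The cost entries are placeholders, not the actual values of (9.53).
A = [[3, 5, 1],
     [0, 4, 2],
     [6, 2, 3]]

# Root: P1 picks a row; next level: P2 picks a column; each of the
# nine leaves holds the corresponding cost.
tree = {u: {v: A[u][v] for v in range(3)} for u in range(3)}

# Following one branch from the root to a leaf yields a single cost.
u, v = 1, 2
print(tree[u][v])  # -> 2, the cost written under that leaf
```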
There are two possible interpretations of the game depicted in Figure
10.12:
- Before it makes its decision, $P_2$ knows which action was applied by $P_1$. This does not correspond to the zero-sum game formulation introduced in Section 9.3 because $P_2$ seems as powerful as nature. In this case, it is not equivalent to the game in Example 9.14.
- $P_2$ does not know the action applied by $P_1$. This is equivalent to assuming that both $P_1$ and $P_2$ make their decisions at the same time, which is consistent with Formulation 9.7. The tree could have alternatively been represented with $P_2$ acting first. (The two interpretations are contrasted numerically below.)
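As a sanity check, the following minimal sketch uses a placeholder cost matrix (again, not the entries of (9.53)); with $P_1$ minimizing the cost $L(u,v)$ and $P_2$ maximizing it, the first interpretation forces $P_1$ to accept the upper value $\min_u \max_v L(u,v)$, whereas under simultaneous play deterministic security strategies guarantee $P_2$ only the lower value $\max_v \min_u L(u,v)$:

```python
# Sketch contrasting the two interpretations of Figure 10.12.
# Placeholder cost matrix; P1 minimizes, P2 maximizes.
A = [[3, 5, 1],
     [0, 4, 2],
     [6, 2, 3]]

# Interpretation 1: P2 sees P1's action and responds, so P1 must
# plan for P2's best response: the upper value min_u max_v.
upper = min(max(row) for row in A)

# Interpretation 2: simultaneous play. A deterministic security
# strategy guarantees P2 only the lower value max_v min_u.
cols = list(zip(*A))
lower = max(min(col) for col in cols)

print(upper, lower)  # -> 4 2
```

Here the two values differ ($4$ versus $2$), so no deterministic saddle point exists for this placeholder matrix, and randomization becomes necessary, consistent with Section 9.3.3.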
Now imagine that $P_1$ and $P_2$ play a sequence of games. A sequential version of the zero-sum game from Section 9.3 will be defined by extending the game tree idea given so far to more levels. This will model the following sequential game:
Formulation 10.3 (Zero-Sum Sequential Game in Tree Form)
- Two players, $P_1$ and $P_2$, take turns playing a game. A stage as considered previously is now stretched into two substages, in which each player acts individually. It is usually assumed that $P_1$ always starts, followed by $P_2$, then $P_1$ again, and so on. Player alternations continue until the game ends. The model reflects the rules of many popular games such as chess or poker. Let $\mathcal{K} = \{1, \ldots, K\}$ denote the set of stages at which $P_1$ and $P_2$ both take a turn.
- As each player takes a turn, it chooses from a nonempty, finite
set of actions. The available set could depend on the decision vertex.
- At the end of the game, a cost for $P_1$ is incurred based on the sequence of actions chosen by each player. The cost is interpreted as a reward for $P_2$.
- The amount of information that each player has when making its decision must be specified. This is usually expressed by indicating what portions of the action histories are known. For example, if $P_1$ just acted, does $P_2$ know its choice? Does it know what action $P_1$ chose in some previous stage?
The game tree can now be described in detail. Figure 10.13 shows a particular example for two stages (hence, $K = 2$ and $\mathcal{K} = \{1, 2\}$). Every vertex corresponds to a point at which a decision needs to be made by one player. Each edge emanating from a vertex represents an action. The root of the tree indicates the beginning of the game, which usually means that $P_1$ chooses an action. The leaves of the tree represent the end of the game, which are the points at which a cost is received. The cost is usually shown below each leaf. One final concern is to specify the information available to each player, just prior to its decision. Which actions among those previously applied by itself or other players are known?
Figure 10.13: A two-player, two-stage game expressed using a game tree.
For the game tree in Figure 10.13, there are two players and two stages. Therefore, there are four levels of decision vertices. The action sets for the players are $U = V = \{L, R\}$, for ``left'' and ``right.'' Since there are always two actions, a binary tree is obtained. There are $16$ possible outcomes, which correspond to all pairwise combinations of four possible two-stage plans for each player.
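The count can be verified by brute force. The following sketch enumerates the four possible two-stage plans per player (viewing a plan, for the moment, as a fixed action sequence) and forms all pairwise combinations:

```python
from itertools import product

# Each two-stage plan, viewed as a fixed action sequence, is a
# pair of actions drawn from {L, R}.
actions = ['L', 'R']
plans = list(product(actions, repeat=2))   # 4 plans per player
print(len(plans))                          # -> 4

# Pairing a plan for P1 with a plan for P2 fixes the path through
# all four decision levels, ending at one of the tree's 16 leaves.
outcomes = [(p1, p2) for p1 in plans for p2 in plans]
print(len(outcomes))                       # -> 16
```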
For a single-stage game, both deterministic and randomized strategies
were defined to obtain saddle points. Recall from Section
9.3.3 that randomized strategies were needed to
guarantee the existence of a saddle point. For a sequential game,
these are extended to deterministic and randomized
plans, respectively. In Section
10.1.3, a (deterministic) plan was defined as a mapping
from the state space to an action space. This definition can be
applied here for each player; however, we must determine what is a
``state'' for the game tree. This depends on the information that
each player has available when it plays.
A general framework for representing information in game trees is
covered in Section 11.7. Three simple kinds of
information will be discussed here. In every case, each player knows
its own actions that were applied in previous stages. The differences
correspond to knowledge of actions applied by the other player. These
define the ``state'' that is used to make the decisions in a plan.
The three information models considered here are as follows; a sketch contrasting the resulting plan representations appears after the list.
- Alternating play: The players take turns playing, and all players know all actions that have been previously applied. This is the situation obtained, for example, in a game of chess. To define a plan, let $N_1$ and $N_2$ denote the sets of all vertices from which $P_1$ and $P_2$ must make a decision, respectively. In Figure 10.13, $N_1$ is the set of dark vertices and $N_2$ is the set of white vertices. Let $U(n_1)$ and $V(n_2)$ be the action spaces for $P_1$ and $P_2$, respectively, which depend on the vertex. A (deterministic) plan for $P_1$ is defined as a function, $\pi_1$, on $N_1$ that yields an action $u \in U(n_1)$ for each $n_1 \in N_1$. Similarly, a (deterministic) plan for $P_2$ is defined as a function, $\pi_2$, on $N_2$ that yields an action $v \in V(n_2)$ for each $n_2 \in N_2$. For the randomized case, let $W(n_1)$ and $Z(n_2)$ denote the sets of all probability distributions over $U(n_1)$ and $V(n_2)$, respectively. A randomized plan for $P_1$ is defined as a function that yields some $w \in W(n_1)$ for each $n_1 \in N_1$. Likewise, a randomized plan for $P_2$ is defined as a function that maps from $N_2$ into $Z(n_2)$.
- Stage-by-stage: Each player knows the actions applied by the other in all previous stages; however, there is no information about actions chosen by others in the current stage. This effectively means that both players act simultaneously in each stage. In this case, a deterministic or randomized plan for $P_1$ is defined as in the alternating play case; however, plans for $P_2$ are defined as functions on $N_1$, instead of $N_2$. This is because at the time it makes its decision, $P_2$ has available precisely the same information as $P_1$. The action spaces for $P_2$ must conform to be dependent on elements of $N_1$, instead of $N_2$; otherwise, $P_2$ would not know what actions are available. Therefore, they are defined as $V(n_1)$ for each $n_1 \in N_1$.
- Open loop: Each player has no knowledge of the previous actions of the other. They only know how many actions have been applied so far, which indicates the stage of the game. Plans are defined as functions on $\mathcal{K}$, the set of stages, because the particular vertex is not known. Note that an open-loop plan is just a sequence of actions in the deterministic case (as in Section 2.3) and a sequence of probability distributions in the randomized case. Again, the action spaces must conform to the information. Thus, they are $U(k)$ and $V(k)$ for each $k \in \mathcal{K}$.
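As an illustration of how the domain of a plan changes across the three models, the sketch below gives one hypothetical deterministic plan for $P_2$ under each of them, for a two-stage binary tree like Figure 10.13. A decision vertex is encoded here as the tuple of actions that reaches it; this encoding and the particular choices are illustrative assumptions, not notation from the text:

```python
# Sketch: deterministic plans for P2 under the three information
# models, for a two-stage binary tree. A decision vertex is encoded
# (hypothetically) as the tuple of actions leading to it.

# Alternating play: P2's plan is a function on N2, the vertices at
# which P2 moves, so it can react to P1's observed action.
def p2_alternating(n2):
    return 'R' if n2[-1] == 'L' else 'L'

# Stage-by-stage: P2's plan is instead a function on N1, since P1's
# current action is not yet known when P2 decides.
def p2_stage_by_stage(n1):
    return 'L' if len(n1) == 0 else 'R'

# Open loop: only the stage index k is known, so the plan is just a
# sequence of actions indexed by stage.
p2_open_loop = {1: 'L', 2: 'R'}

print(p2_alternating(('L',)))    # reacts to P1's move at stage 1
print(p2_stage_by_stage(()))     # uses only P1's information
print(p2_open_loop[1])           # fixed action for stage 1
```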
For a single-stage game, as in Figure 10.12, the
stage-by-stage and open-loop models are equivalent.