9.3.1 Game Formulation

Suppose there are two *players*,
and
, that each have
to make a decision. Each has a finite set of actions, and ,
respectively. The set can be viewed as the ``replacement'' of
from Formulation 9.3 by a set of actions chosen
by a true opponent. Each player has a cost function, which is denoted
as
for . An important
constraint for zero-sum games is

which means that a cost for one player is a reward for the other. This is the basis of the term

In light of (9.41) it is pointless to represent two cost
functions. Instead, the superscript will be dropped, and will
refer to the cost, , of
. The goal of
is to minimize
. Due to (9.41), the goal of
is to maximize .
Thus, can be considered as a *reward* for
, but a *cost* for
.

A formulation can now be given:

- Two players, and .
- A nonempty, finite set called the
*action space for*. For convenience in describing examples, assume that is a set of consecutive integers from to . Each is referred to as an*action of*. - A nonempty, finite set called the
*action space for*. Assume that is a set of consecutive integers from to . Each is referred to as an*action of*. - A function
called the
*cost function*for . This also serves as a*reward function*for because of (9.41).

Before discussing what it means to solve a zero-sum game, some
additional assumptions are needed. Assume that the players know each
other's cost functions. This implies that the motivation of the
opponent is completely understood. The other assumption is that the
players are *rational*, which means
that they will try to obtain the best cost whenever possible.
will not choose an action that leads to higher cost when a lower cost
action is available. Likewise,
will not choose an action that
leads to lower cost. Finally, it is assumed that both players make
their decisions simultaneously. There is no information regarding the
decision of
that can be exploited by
, and vice versa.

Formulation 9.7 is often referred to as a *matrix
game* because can be expressed with a cost matrix, as was done in
Section 9.2. Here the matrix indicates costs for
and
, instead of the robot and nature. All of the required
information from Formulation 9.7 is specified by a
single matrix; therefore, it is a convenient form for expressing
zero-sum games.

in which each row corresponds to some , and each column corresponds to some . Each entry yields , which is the cost for . This representation is similar to that shown in Example 9.8, except that the nature action space, , is replaced by . The cost for is .

Steven M LaValle 2020-08-14