10.5.2 Sequential Games on State Spaces

An apparent problem in the previous section is that the number of vertices grows exponentially in the number of stages. In some games, however, there may be multiple action sequences that lead to the same state. This is true of many popular games, such as chess, checkers, and tic-tac-toe. In this case, it is convenient to define a state space that captures the complete set of unique game configurations. The player actions then transform the state. If there are different action sequences that lead to the same state, then separate vertices are not needed. This converts the game tree into a game graph by declaring vertices that represent the same state to be equivalent. The game graph is similar in many ways to the transition graphs discussed in Section 10.1, in the sequential game against nature. The same idea can be applied when there are opposing players.

We will arrive at a sequential game that is played over a state space by collapsing the game tree into a game graph. We will also allow the more general case of costs occurring on any transition edges, as opposed to only the leaves of the original game tree. Only the stage-by-stage model from the game tree is generalized here. Generalizations that use other information models are considered in Section 11.7. In the formulation that follows, $ {{\rm P}_2}$ can be can viewed as the replacement for nature in Formulation 10.1. The new formulation is still a generalization of Formulation 9.7, which was a single-stage, zero-sum game. To keep the concepts simpler, all spaces are assumed to be finite. The formulation is as follows.

Formulation 10..4 (Sequential Zero-Sum Game on a State Space)  
  1. Two players, $ {{\rm P}_1}$ and $ {{\rm P}_2}$.
  2. A finite, nonempty state space $ X$.
  3. For each state $ x \in X$, a finite, nonempty action space $ U(x)$ for $ {{\rm P}_1}$.
  4. For each state $ x \in X$, a finite, nonempty action space $ V(x)$ for $ {{\rm P}_2}$. To allow an extension of the alternating play model from Section 10.5.1, $ V(x,u)$ could alternatively be defined, to enable the set of actions available to $ {{\rm P}_2}$ to depend on the action $ u \in U$ of $ {{\rm P}_1}$.
  5. A state transition function $ f$ that produces a state, $ f(x,u,v)$, for every $ x \in X$, $ u \in U(x)$, and $ v \in V(x)$.
  6. A set $ {\cal K}$ of $ K$ stages, each denoted by $ k$, which begins at $ k=1$ and ends at $ k = K$. Let $ F = K+1$, which is the final stage, after the last action is applied.
  7. An initial state $ {x_{I}}\in X$. For some problems, this may not be specified, in which case a solution must be found from all initial states.
  8. A stage-additive cost functional $ L$. Let $ {\tilde{v}}_K$ denote the history of $ {{\rm P}_2}$'s actions up to stage $ K$. The cost functional may be applied to any combination of state and action histories to yield

    $\displaystyle L({\tilde{x}}_F,{\tilde{u}}_K,{\tilde{v}}_K) = \sum_{k=1}^K l(x_k,u_k,v_k) + l_F(x_F) .$ (10.106)

It will be assumed that both players always know the current state. Note that there are no termination actions in the formulation. The game terminates after each player has acted $ K$ times. There is also no direct formulation of a goal set. Both termination actions and goal sets can be added to the formulation without difficulty, but this is not considered here. The action sets can easily be extended to allow a dependency on the stage, to yield $ U(x,k)$ and $ V(x,k)$. The methods presented in this section can be adapted without trouble. This is avoided, however, to make the notation simpler.



Subsections
Steven M LaValle 2020-08-14