9.3.2 Deterministic Strategies

What constitutes a good solution to Formulation 9.7? Consider the game from the perspective of ${{\rm P}_1}$ . It seems reasonable to apply worst-case analysis when trying to account for the action that will be taken by ${{\rm P}_2}$ . This results in a choice that is equivalent to assuming that ${{\rm P}_2}$ is nature acting under the nondeterministic model, as considered in Section 9.2.2. For a matrix game, this is computed by first determining the maximum cost over each row. Selecting the action that produces the minimum among these represents the lowest cost that ${{\rm P}_1}$ can guarantee for itself. Let this selection be referred to as a security strategy for ${{\rm P}_1}$ .

For the matrix game in (9.42), the security strategy is illustrated as

$\displaystyle \begin{tabular}{cc} & $V$ $U$ & \begin{tabular}{\vert c c c ... ...-2 & 2 & 0 & 1 & $\rightarrow {\bf 2}$ \hline \end{tabular} \end{tabular} ,$

(9.43)

in which

and

are the best actions. Each yields a cost no worse than

, regardless of the action chosen by ${{\rm P}_2}$ .

This can be formalized using the existing notation. A security strategy, $u^{*}$ , for ${{\rm P}_1}$ is defined in general as

$\displaystyle u^{*} = \operatornamewithlimits{argmin}_{u \in U} \Big\{ \max_{v \in V} \Big\{ L(u,v) \Big\} \Big\} .$

(9.44)

There may be multiple security strategies that satisfy the $\operatornamewithlimits{argmin}$ ; however, this does not cause trouble, as will be explained shortly. Let the resulting worst-case cost be denoted by $\overline{L}^*$ , and let it be called the upper value of the game. This is defined as

$\displaystyle \overline{L}^*= \max_{v \in V} \Big\{ L(u^{*},v) \Big\}.$

(9.45)

Now swap roles, and consider the game from the perspective of ${{\rm P}_2}$ , which would like to maximize . It can also use worst-case analysis, which means that it would like to select an action that guarantees a high cost, in spite of the action of ${{\rm P}_1}$ to potentially reduce it. A security strategy, $v^{*}$ , for ${{\rm P}_2}$ is defined as

$\displaystyle v^{*} = \operatornamewithlimits{argmax}_{v \in V} \Big\{ \min_{u \in U} \Big\{ L(u,v) \Big\} \Big\} .$

(9.46)

Note the symmetry with respect to (9.44). There may be multiple security strategies for ${{\rm P}_2}$ . A security strategy $v^{*}$ is just an ``upside-down'' version of the worst-case analysis applied in Section 9.2.2. The lower value, $\underline{L}^*$ , is defined as

$\displaystyle \underline{L}^*= \min_{u \in U} \Big\{ L(u,v^{*}) \Big\} .$

(9.47)

Returning to the matrix game in (9.42), the last column is selected by applying (9.46):

$\displaystyle \begin{tabular}{cc} & $V$\ \\ $U$\ & \begin{tabular}{\vert c\vert... ...row$\ & $\downarrow$\ \\ -2 & -1 & 0 & {\bf 1} \\ \end{tabular} \end{tabular} .$

(9.48)

An interesting relationship between the upper and lower values is that $\underline{L}^*\leq \overline{L}^*$ for any game using Formulation 9.7. This is shown by observing that

$\displaystyle \underline{L}^*= \min_{u \in U} \Big\{ L(u,v^{*}) \Big\} \leq L(u^{*},v^{*}) \leq \max_{v \in V} \Big\{ L(u^{*},v) \Big\} = \overline{L}^*,$

(9.49)

in which $L(u^{*},v^{*})$ is the cost received when the players apply their respective security strategies. If the game is played by rational DMs, then the resulting cost always lies between $\underline{L}^*$ and $\overline{L}^*$ .

Subsections

- Regret
- Saddle points

Steven M LaValle 2020-08-14