Since a game under Formulation 9.7 can be nicely expressed as a matrix, it is tempting to use linear algebra to conveniently express expected costs. Let and . As in Section 9.1.3, a randomized strategy for can be represented as an -dimensional vector,
(9.55) |
Let denote the expected cost that will be received if plays and plays . This can be computed as
Let and denote the set of all randomized strategies for and , respectively. These spaces include strategies that are equivalent to the deterministic strategies considered in Section 9.3.2 by assigning probability one to a single action. Thus, and can be considered as expansions of the set of possible strategies in comparison to what was available in the deterministic setting. Using and , randomized security strategies for and are defined as
The randomized upper value is defined as
The most fundamental result in zero-sum game theory was shown by von Neumann [956,957], and it states that for any game in Formulation 9.7. This yields the randomized value for the game. This means that there will never be expected regret if the players stay with their security strategies. If the players apply their randomized security strategies, then a randomized saddle point is obtained. This saddle point cannot be seen as a simple pattern in the matrix because it instead exists over and .
The guaranteed existence of a randomized saddle point is an important result because it demonstrates the value of randomization when making decisions against an intelligent opponent. In Example 9.7, it was intuitively argued that randomization seems to help when playing against an intelligent adversary. When playing the game repeatedly with a deterministic strategy, the other player could learn the strategy and win every time. Once a randomized strategy is used, the players will not experience regret.
Steven M LaValle 2020-08-14