Up until now, any actions taken in a plan have been deterministic. The plans in Chapter 2 specified actions with complete certainty. Formulation 9.1 was solved by specifying the best action. It can be viewed as a strategy that trivially makes the same decision every time.
In some applications, the decision maker may not want to be predictable. To achieve this, randomization can be incorporated into the strategy. If is discrete, a randomized strategy, , is specified by a probability distribution, , over . Let denote the set of all possible randomized strategies. When the strategy is applied, an action is chosen by sampling according to the probability distribution, . We now have to make a clear distinction between defining the strategy and applying the strategy. So far, the two have been equivalent; however, a randomized strategy must be executed to determine the resulting action. If the strategy is executed repeatedly, it is assumed that each trial is independent of the actions obtained in previous trials. In other words, , in which represents the probability that the strategy chooses action in trial , given that was chosen in trial for some . If is continuous, then a randomized strategy may be specified by a probability density function, . In decision-theory and game-theory literature, deterministic and randomized strategies are often referred to as pure and mixed, respectively.
A deterministic strategy can always be viewed as a special case of a randomized strategy, if you are not bothered by events that have probability zero. A deterministic strategy, , can be simulated by a random strategy by assigning if , and otherwise. Only with probability zero can different actions be chosen (possible, but not probable!).
Imagine using a randomized strategy to solve a problem expressed using Formulation 9.1. The first difficulty appears to be that the cost cannot be predicted. If the strategy is applied numerous times, then we can define the average cost. As the number of times tends to infinity, this average would converge to the expected cost, denoted by , if is treated as a random variable (in addition to the cost function). If is discrete, the expected cost of a randomized strategy is
(9.13) |
An interesting question is whether there exists some such that , for all . In other words, do there exist randomized strategies that are better than all deterministic strategies, using Formulation 9.1? The answer is no because the best strategy is always to assign probability one to the action, , that minimizes . This is equivalent to using a deterministic strategy. If there are two or more actions that obtain the optimal cost, then a randomized strategy could arbitrarily distribute all of the probability mass between these. However, there would be no further reduction in cost. Therefore, randomization seems pointless in this context, unless there are other considerations.
One important example in which a randomized strategy is of critical importance is when making decisions in competition with an intelligent adversary. If the problem is repeated many times, an opponent could easily learn any deterministic strategy. Randomization can be used to weaken the prediction capabilities of an opponent. This idea will be used in Section 9.3 to obtain better ways to play zero-sum games.
Following is an example that illustrates the advantage of randomization when repeatedly playing against an intelligent opponent.
A generalization of this to three actions is the famous game of
Rock-Paper-Scissors [958]. If you want to design a computer
program that repeatedly plays this game against smart opponents, it
seems best to incorporate randomization.
Steven M LaValle 2020-08-14