Suppose that the nondeterministic model of nature is used. A dynamic programming recurrence, (10.39), will be derived. This directly yields an iterative approach that computes a plan that minimizes the worst-case cost. The following presentation shadows that of Section 2.3.1; therefore, it may be helpful to refer back to this periodically.
An optimal plan will be found by computing optimal cost-to-go functions. For , let denote the worst-case cost that could accumulate from stage to under the execution of the optimal plan (compare to (2.5))
Now consider making passes over , each time computing from , as ranges from down to . In the first iteration, is copied from . In the second iteration, is computed for each as (compare to (2.7))
More generally, can be computed once is given. Carefully study (10.33), and note that it can be written as (compare to (2.9))
Steven M LaValle 2020-08-14