Suppose that the nondeterministic model of nature is used. A dynamic programming recurrence, (10.39), will be derived. This directly yields an iterative approach that computes a plan that minimizes the worst-case cost. The following presentation shadows that of Section 2.3.1; therefore, it may be helpful to refer back to this periodically.
An optimal plan will be found by computing optimal
cost-to-go functions. For
, let
denote
the worst-case cost that could accumulate from stage
to
under
the execution of the optimal plan (compare to (2.5))
Now consider making passes over
, each time computing
from
, as
ranges from
down to
.
In the first iteration,
is copied from
. In the
second iteration,
is computed for each
as
(compare to (2.7))
More generally, can be computed once
is
given. Carefully study (10.33), and note that it can
be written as (compare to (2.9))
Steven M LaValle 2020-08-14