Nondeterministic case

Suppose that the nondeterministic model of nature is used. A dynamic programming recurrence, (10.39), will be derived. This directly yields an iterative approach that computes a plan that minimizes the worst-case cost. The following presentation shadows that of Section 2.3.1; therefore, it may be helpful to refer back to this periodically.

An optimal plan $\pi^*$ will be found by computing optimal cost-to-go functions. For $1 \leq k \leq F$ , let denote the worst-case cost that could accumulate from stage to under the execution of the optimal plan (compare to (2.5))

$\displaystyle G^*_k(x_k) = \min_{u_k} \max_{\theta_k} \min_{u_{k+1}} \max_{\the... ...max_{\theta_K} \left\{ \sum_{i=k}^{K} l(x_i,u_i,\theta_i) + l_F(x_F) \right\} .$

(10.33)

Inside of the $\min$ 's and $\max$ 's of (10.33) are the last

terms of the cost functional, (10.8). For simplicity, the ranges of each

and $\theta _i$ in the $\min$ 's and $\max$ 's of (10.33) have not been indicated. The optimal cost-to-go for

$\displaystyle G^*_F(x_F) = l_F(x_F) ,$

(10.34)

which is the same as (2.6) for the predictable case.

Now consider making passes over , each time computing from $G^*_{k+1}$ , as ranges from down to . In the first iteration, is copied from . In the second iteration, is computed for each $x_K \in X$ as (compare to (2.7))

$\displaystyle G^*_K(x_K) = \min_{u_K} \max_{\theta_K} \Big\{ l(x_K,u_K,\theta_K) + l_F(x_F) \Big\},$

(10.35)

in which $u_K \in U(x_K)$ and $\theta_K \in \Theta(x_K,u_K)$ . Since

and $x_F = f(x_K,u_K,\theta_K)$ , substitutions are made into (10.35) to obtain (compare to (2.8))

$\displaystyle G^*_K(x_K) = \min_{u_K} \max_{\theta_K} \Big\{ l(x_K,u_K,\theta_K) + G^*_F(f(x_K,u_K,\theta_K)) \Big\},$

(10.36)

which computes the costs of all optimal one-step plans from stage

to stage

More generally, can be computed once $G^*_{k+1}$ is given. Carefully study (10.33), and note that it can be written as (compare to (2.9))

$\begin{displaymath}\begin{split}G^*_k(x_k) = \min_{{u_k}} \max_{{\theta_k}} \Big... ...({x_i},{u_i},\theta_i) + l_F(\xKp1) \Bigg\} \Bigg\} \end{split}\end{displaymath}$

(10.37)

by pulling the first cost term out of the sum and by separating the minimization over ${u_k}$ from the rest, which range from $u_{k+1}$ to

. The second $\min$ and $\max$ do not affect the $l({x_k},{u_k},{\theta_k})$ term; thus, $l({x_k},{u_k},{\theta_k})$ can be pulled outside to obtain (compare to (2.10))

$\begin{displaymath}\begin{split}G^*_k({x_k}) = \min_{{u_k}} \max_{{\theta_k}} \B... ...({x_i},{u_i},\theta_i) + l(\xKp1) \Bigg\} \Bigg\} . \end{split}\end{displaymath}$

(10.38)

The inner $\min$ 's and $\max$ 's represent $G^*_{k+1}$ , which yields the recurrence (compare to (2.11))

$\displaystyle G^*_k({x_k}) = \min_{{u_k}\in U({x_k})} \Big\{ \max_{{\theta_k}} \Big\{ l({x_k},{u_k},{\theta_k}) + G^*_{k+1}(\xkp1) \Big\} \Big\} .$

(10.39)

Steven M LaValle 2020-08-14