Nondeterministic case

Suppose that the nondeterministic model of nature is used. A dynamic programming recurrence, (10.39), will be derived. This directly yields an iterative approach that computes a plan that minimizes the worst-case cost. The following presentation shadows that of Section 2.3.1; therefore, it may be helpful to refer back to this periodically.

An optimal plan $ \pi^*$ will be found by computing optimal cost-to-go functions. For $ 1 \leq k \leq F$, let $ G^*_k$ denote the worst-case cost that could accumulate from stage $ k$ to $ F$ under the execution of the optimal plan (compare to (2.5))

$\displaystyle G^*_k(x_k) = \min_{u_k} \max_{\theta_k} \min_{u_{k+1}} \max_{\the...
...max_{\theta_K} \left\{ \sum_{i=k}^{K} l(x_i,u_i,\theta_i) + l_F(x_F) \right\} .$ (10.33)

Inside of the $ \min$'s and $ \max$'s of (10.33) are the last $ F-k$ terms of the cost functional, (10.8). For simplicity, the ranges of each $ u_i$ and $ \theta _i$ in the $ \min$'s and $ \max$'s of (10.33) have not been indicated. The optimal cost-to-go for $ k = F$ is

$\displaystyle G^*_F(x_F) = l_F(x_F) ,$ (10.34)

which is the same as (2.6) for the predictable case.

Now consider making $ K$ passes over $ X$, each time computing $ G^*_k$ from $ G^*_{k+1}$, as $ k$ ranges from $ F$ down to $ 1$. In the first iteration, $ G^*_F$ is copied from $ l_F$. In the second iteration, $ G^*_K$ is computed for each $ x_K \in X$ as (compare to (2.7))

$\displaystyle G^*_K(x_K) = \min_{u_K} \max_{\theta_K} \Big\{ l(x_K,u_K,\theta_K) + l_F(x_F) \Big\},$ (10.35)

in which $ u_K \in U(x_K)$ and $ \theta_K \in \Theta(x_K,u_K)$. Since $ l_F = G^*_F$ and $ x_F = f(x_K,u_K,\theta_K)$, substitutions are made into (10.35) to obtain (compare to (2.8))

$\displaystyle G^*_K(x_K) = \min_{u_K} \max_{\theta_K} \Big\{ l(x_K,u_K,\theta_K) + G^*_F(f(x_K,u_K,\theta_K)) \Big\},$ (10.36)

which computes the costs of all optimal one-step plans from stage $ K$ to stage $ F = K+1$.

More generally, $ G^*_k$ can be computed once $ G^*_{k+1}$ is given. Carefully study (10.33), and note that it can be written as (compare to (2.9))

\begin{displaymath}\begin{split}G^*_k(x_k) = \min_{{u_k}} \max_{{\theta_k}} \Big...
...({x_i},{u_i},\theta_i) + l_F(\xKp1) \Bigg\} \Bigg\} \end{split}\end{displaymath} (10.37)

by pulling the first cost term out of the sum and by separating the minimization over $ {u_k}$ from the rest, which range from $ u_{k+1}$ to $ u_K$. The second $ \min$ and $ \max$ do not affect the $ l({x_k},{u_k},{\theta_k})$ term; thus, $ l({x_k},{u_k},{\theta_k})$ can be pulled outside to obtain (compare to (2.10))

\begin{displaymath}\begin{split}G^*_k({x_k}) = \min_{{u_k}} \max_{{\theta_k}} \B...
...({x_i},{u_i},\theta_i) + l(\xKp1) \Bigg\} \Bigg\} . \end{split}\end{displaymath} (10.38)

The inner $ \min$'s and $ \max$'s represent $ G^*_{k+1}$, which yields the recurrence (compare to (2.11))

$\displaystyle G^*_k({x_k}) = \min_{{u_k}\in U({x_k})} \Big\{ \max_{{\theta_k}} \Big\{ l({x_k},{u_k},{\theta_k}) + G^*_{k+1}(\xkp1) \Big\} \Big\} .$ (10.39)

Steven M LaValle 2020-08-14