Consider a problem formulation that is identical to Formulation 8.1 except that $X$ is allowed to be continuous. Assume that $X$ is bounded, and assume for now that the action space, $U(x)$, is finite for all $x \in X$. Backward value iteration can be applied. The dynamic programming arguments and derivation are identical to those in Section 2.3. The resulting recurrence is identical to (2.11) and is repeated here for convenience:

\[
G^*_k(x_k) \;=\; \min_{u_k \in U(x_k)} \Big\{ l(x_k,u_k) + G^*_{k+1}(x_{k+1}) \Big\}, \tag{8.56}
\]

in which $x_{k+1} = f(x_k,u_k)$.
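As a rough illustration (not from the text), one stage of the recurrence over a finite sample set can be sketched as follows. All names (`samples`, `actions`, `step`, `cost`, `interpolate`, `G_next`) are placeholders, and `interpolate` stands in for the interpolation schemes developed below.

```python
# Sketch of one backward value-iteration stage over a finite sample set S.
# All names are illustrative placeholders, not definitions from the text.

def value_iteration_stage(samples, actions, step, cost, interpolate, G_next):
    """Compute G*_k at every sample, given G*_{k+1} stored on the samples.

    samples     : list of states at which the cost-to-go is stored
    actions     : function mapping a state x to its finite action set U(x)
    step        : state transition function, x_{k+1} = f(x_k, u_k)
    cost        : stage cost l(x_k, u_k)
    interpolate : evaluates G*_{k+1} at an arbitrary state from sampled values
    G_next      : dict mapping each sample to its G*_{k+1} value
    """
    G_k = {}
    for x in samples:
        # Recurrence (8.56): minimize stage cost plus interpolated cost-to-go.
        G_k[x] = min(
            cost(x, u) + interpolate(step(x, u), G_next)
            for u in actions(x)
        )
    return G_k
```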
Suppose that a finite set $S \subset X$ of samples is used to represent cost-to-go functions over $X$. The evaluation of (8.56) using interpolation is depicted in Figure 8.19. In general, the samples should be chosen to reduce the dispersion (defined in Section 5.2.3) as much as possible. This prevents attempts to approximate the cost-to-go function on large areas that contain no sample points. The rate of convergence ultimately depends on the dispersion [92] (in combination with Lipschitz conditions on the state transition equation and the cost functional). To simplify notation and some other issues, assume that $S$ is a grid of regularly spaced points in ${\mathbb R}^n$.
First, consider the case in which $X = [0,1] \subset {\mathbb R}$. Let $S = \{ s_0, s_1, \ldots, s_r \}$, in which $s_i = i/r$. For example, if $r = 3$, then $S = \{ 0, 1/3, 2/3, 1 \}$. Note that this always yields points on the boundary of $X$, which ensures that for any point in $X$ there are samples both above and below it. Let $i$ be the largest integer such that $s_i < x$. This implies that $s_{i+1} > x$. The samples $s_i$ and $s_{i+1}$ are called interpolation neighbors of $x$.

The value of $G^*_{k+1}$ in (8.56) at any $x \in [0,1]$ can be obtained via linear interpolation as

\[
G^*_{k+1}(x) \;\approx\; \alpha \, G^*_{k+1}(s_i) + (1 - \alpha) \, G^*_{k+1}(s_{i+1}),
\]

in which the coefficient $\alpha = (s_{i+1} - x)/(s_{i+1} - s_i) \in [0,1]$.
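A minimal sketch of this one-dimensional scheme follows, assuming the regular grid $s_i = i/r$ over $[0,1]$; the function and variable names are illustrative, not from the text.

```python
def interpolate_1d(x, G, r):
    """Linearly interpolate cost-to-go values stored at the samples s_i = i/r.

    G is a list of length r + 1 holding G*_{k+1}(s_i) for i = 0, ..., r.
    """
    if x <= 0.0:
        return G[0]
    if x >= 1.0:
        return G[r]
    i = int(x * r)                # index of the lower interpolation neighbor s_i
    if i == r:                    # guard the right boundary
        i = r - 1
    alpha = 1.0 - (x * r - i)     # weight on the lower neighbor s_i
    return alpha * G[i] + (1.0 - alpha) * G[i + 1]

# Example: interpolate halfway between s_1 = 1/3 and s_2 = 2/3 for r = 3.
G = [0.0, 1.0, 4.0, 9.0]
print(interpolate_1d(0.5, G, r=3))   # 2.5
```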
The interpolation idea can be naturally extended to multiple dimensions. Let $X$ be a bounded subset of ${\mathbb R}^n$. Let $S$ represent an $n$-dimensional grid of points in ${\mathbb R}^n$. Each sample in $S$ is denoted by $s(i_1, i_2, \ldots, i_n)$. For some $x \in X$, there are $2^n$ interpolation neighbors that ``surround'' it. These are the corners of an $n$-dimensional cube that contains $x$. Let $x = (x_1, \ldots, x_n)$. Let $i_j$ denote the largest integer for which the $j$th coordinate of $s(i_1, \ldots, i_n)$ is less than $x_j$. The $2^n$ samples are all those for which either $i_j$ or $i_j + 1$ appears in the expression $s(\cdot, \ldots, \cdot)$, for each $j \in \{1, \ldots, n\}$. This requires that samples exist in $S$ for all $2^n$ of these cases. Note that $X$ may be a complicated subset of ${\mathbb R}^n$, provided that for any $x \in X$, all of the required interpolation neighbors are in $S$. Using the $2^n$ interpolation neighbors, the value of $G^*_{k+1}$ in (8.56) on any $x \in X$ can be obtained via multi-linear interpolation. In the case of $n = 2$, this is expressed as

\[
\begin{aligned}
G^*_{k+1}(x) \;\approx\;& \alpha_1 \alpha_2 \, G^*_{k+1}(s(i_1, i_2)) + \alpha_1 (1 - \alpha_2) \, G^*_{k+1}(s(i_1, i_2 + 1)) \\
&+ (1 - \alpha_1) \alpha_2 \, G^*_{k+1}(s(i_1 + 1, i_2)) + (1 - \alpha_1)(1 - \alpha_2) \, G^*_{k+1}(s(i_1 + 1, i_2 + 1)),
\end{aligned}
\]

in which $\alpha_1$ and $\alpha_2$ are the linear interpolation coefficients along the first and second coordinates, defined in the same way as $\alpha$ in the one-dimensional case.
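The following sketch generalizes this to $n$ dimensions on a regular grid over the unit cube; as before, the names are illustrative and the grid representation (a dictionary keyed by integer index tuples) is only an assumption made for the example.

```python
from itertools import product

def multilinear_interpolate(x, G, r):
    """Multi-linear interpolation on a regular grid over [0,1]^n.

    x : point in [0,1]^n
    G : dict mapping integer index tuples (i_1, ..., i_n) to sampled values
    r : number of cells per axis, so sample coordinates are i_j / r
    """
    n = len(x)
    base = []    # lower grid index i_j along each axis
    frac = []    # fractional offset of x_j within its cell
    for xj in x:
        i = min(int(xj * r), r - 1)
        base.append(i)
        frac.append(xj * r - i)

    value = 0.0
    # Sum over the 2^n corners of the enclosing cube.
    for corner in product((0, 1), repeat=n):
        weight = 1.0
        for j, bit in enumerate(corner):
            weight *= frac[j] if bit else (1.0 - frac[j])
        index = tuple(base[j] + corner[j] for j in range(n))
        value += weight * G[index]
    return value

# Bilinear example on the unit square (r = 1): average of the four corners.
G = {(0, 0): 0.0, (1, 0): 1.0, (0, 1): 2.0, (1, 1): 3.0}
print(multilinear_interpolate((0.5, 0.5), G, r=1))   # 1.5
```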
Unfortunately, the number of interpolation neighbors grows exponentially with the dimension, $n$. Instead of using all $2^n$ interpolation neighbors, one improvement is to decompose the cube defined by the $2^n$ samples into simplexes. Each simplex has only $n + 1$ samples as its vertices. Only the vertices of the simplex that contains $x$ are declared to be the interpolation neighbors of $x$; this reduces the cost of evaluating $G^*_{k+1}(x)$ to $O(n)$ time. The problem, however, is that determining the simplex that contains $x$ may be a challenging point-location problem (a common problem in computational geometry [264]). If barycentric subdivision is used to decompose the cube using the midpoints of all faces, then the point-location problem can be solved in $O(n \lg n)$ time [263,607,721], which is an improvement over the $O(2^n)$ scheme described above. Examples of this decomposition are shown for two and three dimensions in Figure 8.20. This is sometimes called the Coxeter-Freudenthal-Kuhn triangulation. Even though $n$ is not too large due to practical performance considerations (typically, $n \leq 6$), substantial savings occur in implementations, even for $n = 3$.
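As a hedged sketch of the simplex-based alternative, one standard formulation of the Kuhn-style triangulation locates the containing simplex by sorting the fractional coordinates of $x$ within its grid cell, which accounts for the $O(n \lg n)$ point-location cost; interpolation then uses only the $n + 1$ simplex vertices. The names below are illustrative.

```python
def simplex_interpolate(x, G, r):
    """Interpolate using only the n + 1 vertices of the simplex containing x.

    x : point in [0,1]^n
    G : dict mapping integer index tuples to sampled values on an r-cell grid
    r : number of cells per axis
    """
    n = len(x)
    base, frac = [], []
    for xj in x:
        i = min(int(xj * r), r - 1)
        base.append(i)
        frac.append(xj * r - i)

    # Sorting the fractional coordinates identifies the containing simplex
    # of the Kuhn-style triangulation; this is the point-location step.
    order = sorted(range(n), key=lambda j: frac[j], reverse=True)

    # Walk from the lower cube corner, raising one coordinate at a time,
    # to enumerate the n + 1 simplex vertices.
    vertex = list(base)
    vertices = [tuple(vertex)]
    for j in order:
        vertex[j] += 1
        vertices.append(tuple(vertex))

    # Barycentric weights of x with respect to the simplex vertices.
    sorted_f = [frac[j] for j in order]
    weights = [1.0 - sorted_f[0]]
    weights += [sorted_f[k] - sorted_f[k + 1] for k in range(n - 1)]
    weights.append(sorted_f[-1])

    return sum(w * G[v] for w, v in zip(weights, vertices))
```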
It will be convenient to refer directly to the set of all points in $X$ for which all required interpolation neighbors exist. For any finite set $S \subseteq X$ of sample points, let the interpolation region $R(S)$ be the set of all $x \in X \setminus S$ for which $G^*(x)$ can be computed by interpolation. This means that $x \in R(S)$ if and only if all interpolation neighbors of $x$ lie in $S$. Figure 8.21a shows an example. Note that some sample points may not contribute any points to $R(S)$. If a grid of samples is used to approximate $G^*$, then the volume of $X \setminus R(S)$ approaches zero as the sampling resolution increases.
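A small sketch of this definition follows, assuming (purely for illustration) that $S$ is stored as a set of integer index tuples on a grid over the unit cube.

```python
from itertools import product

def in_interpolation_region(x, S, r):
    """Return True if all 2^n interpolation neighbors of x are samples in S.

    x : point in [0,1]^n
    S : set of integer index tuples that actually carry cost-to-go values
    r : number of cells per axis
    """
    n = len(x)
    base = [min(int(xj * r), r - 1) for xj in x]
    # Every corner of the enclosing cell must be a stored sample.
    return all(
        tuple(base[j] + corner[j] for j in range(n)) in S
        for corner in product((0, 1), repeat=n)
    )
```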