Imagine exploring an unknown planet using a robotic vehicle. The robot moves along the rugged terrain while using a range scanner to make precise measurements of the ground in its vicinity. As the robot moves, it may discover that some parts were easier to traverse than it originally thought. In other cases, it might realize that some direction it was intending to go is impassable due to a large bolder or a ravine. If the goal is to arrive at some specified coordinates, this problem can be viewed as a navigation problem in an unknown environment. The resulting solution is a lazy approach, as discussed in Section 12.2.1.
![]() |
This section presents Stentz's algorithm [913], which has
been used in many outdoor vehicle navigation applications, such as the
vehicle shown in Figure 12.17. The algorithm can be considered
as a dynamic version of the backward variant of Dijkstra's
algorithm. Thus, it maintains cost-to-go values, and the search
grows outward from the goal, as opposed to cost-to-come values from
in the version of Dijkstra's algorithm in Section
2.3.3. The method applies to any optimal planning problem.
In terms of the state transition graph, it is assumed that the costs
of edge transitions are unknown (equivalently, each cost
is
unknown). In the navigation problem, a positive cost indicates the
difficulty of traveling from state
to state
.
To work with a concrete problem, imagine that a planet surface is
partitioned into a high-resolution grid. The state space is simply a
bounded set of grid tiles; hence,
. Each
grid tile is assigned a positive, real value,
, that indicates
the difficulty of its traversal. The actions
at each grid
point can be chosen using standard grid neighbors (e.g.,
four-neighbors or eight-neighbors). This now defines a state
transition graph over
. From any
and
such that
, the cost term is assigned using
as
. This model is a generalization of the grid in
Section 12.3.1, in which the tiles were either empty or
occupied; here any positive real value is allowed. In the coming
explanation, the costs may be more general than what is permitted by
starting from
, and the state transition graph does not need to
be derived from a grid. Some initial values are assigned arbitrarily
for all
. For example, in the planetary exploration
application, the cost of traversing a level, unobstructed surface may
be uniformly assumed.
The task is to navigate to some goal state, . The method
works by initially constructing a feedback plan,
, on a subset
of
that includes both
and
. The plan,
,
is computed by iteratively applying the procedure in Figure
12.18 until the optimal cost-to-go is known at
. A
priority queue,
, is maintained as in Dijkstra's algorithm;
however, Stentz's algorithm allows the costs of elements in
to be
modified due to information sensed during execution. Let
denote the lowest cost-to-go associated with
during the time it
spends in
. Assume that
is sorted according to
. Let
denote its current cost-to-go value, which may actually be
more than
if some cost updates caused it to increase.
Suppose that some
can be applied to reach a state
. Let
denote the cost-to-go from
by
traveling via
,
![]() |
(12.24) |
After the iterations of Figure 12.18 finish, the robot
executes , which generates a sequence of visited states. Let
denote the current state during execution. If it is discovered
that if
would be applied, the received cost would not
match the cost
in the current model, then the costs need
to be updated. More generally, the robot may have to be able to update
costs within a region around
that corresponds to the sensor
field of view. For the description below, assume that an update,
, is obtained for
only (the more general case is
handled similarly). First,
is updated to the newly
measured value. If
happened to be dead (visited, but no longer
in
), then it is inserted again into
, with cost
.
The steps in Figure 12.18 are performed until
for all
. Following this, the plan execution
continues until either the goal is reached or another cost mismatch
is discovered. At any time during execution, the robot motions are
optimal given the current information about the costs [913].
Figure 12.19 illustrates the execution of the algorithm. Figure 12.19a shows a synthetic terrain that was generated by a stochastic fractal. Darker gray values indicate higher cost. In the center, very costly terrain acts as a barrier, for which an escape route exists in the downward direction. The initial state is the middle of the left edge of the environment, and the goal state is the right edge. The robot initially plans a straight-line path and then incrementally updates the path in each step as it moves. In Figure 12.19b, the robot has encountered the costly center and begins to search for a way around. Finally, the goal is reached, as shown in Figure 12.19c. The executed path is actually the result of executing a series of optimal paths, each of which is based on the known information at the time a single action is applied.