Recall that any problem specified using Formulation 11.1 can be converted using derived I-states into a problem under Formulation 10.1. By building on the discussion from the end of Section 11.1.3, this can be achieved by treating the I-space as a big state space in which each state is an I-state in the original problem formulation. Some of the components were given previously, but here a complete formulation is given.
Suppose that a problem has been specified using Formulation 11.1, resulting in the usual components: $X$, $U$, $\Theta(x,u)$, $f$, $Y$, $h$, $\eta_0$, $\mathcal{I}_{hist}$, and $L$. The following concepts will work for any sufficient I-map; however, the presentation will be limited to two important cases: $\kappa_{ndet}$ and $\kappa_{prob}$, which yield derived I-spaces $\mathcal{I}_{ndet}$ and $\mathcal{I}_{prob}$, respectively (recall Sections 11.2.2 and 11.2.3).
[Figure 12.1: a summary of the components of the new formulation, defined in terms of the original problem.]
The components of Formulation 10.1 will now be specified using components of the original problem. To avoid confusion between the two formulations, an arrow will be placed above all components of the new formulation. Figure 12.1 summarizes the coming definitions. The new state space, $\vec{X}$, is defined as $\vec{X} = \mathcal{I}_{der}$, and a state, $\vec{x} \in \vec{X}$, is a derived I-state, $\vec{x} = \kappa(\eta)$. Under nondeterministic uncertainty, $\vec{x}$ means $X(\eta)$, in which $\eta$ is the history I-state. Under probabilistic uncertainty, $\vec{x}$ means $P(x \mid \eta)$. The action space remains the same: $\vec{U} = U$.
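To make the change of state space concrete, here is a minimal sketch (not from the text; a finite state space and all names are illustrative assumptions) of how a derived I-state might be represented: a nondeterministic I-state as a set of possible states, and a probabilistic I-state as a dictionary of probabilities.

```python
from typing import Dict, FrozenSet, Hashable

State = Hashable  # an element of the original state space X (assumed finite here)

# Nondeterministic derived I-state: the subset X(eta) of X consistent with the history.
NDetIState = FrozenSet[State]

# Probabilistic derived I-state: the distribution P(x | eta) over X.
ProbIState = Dict[State, float]

# Hypothetical derived I-states over a tiny state space {0, 1, 2}:
x_ndet: NDetIState = frozenset({0, 1})          # the true state is 0 or 1
x_prob: ProbIState = {0: 0.7, 1: 0.2, 2: 0.1}   # a belief over all of X
```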
The strangest part of the formulation is the new nature action space, $\vec{\Theta}(\vec{x},\vec{u})$. The observations in Formulation 11.1 behave very much like nature actions because they are not selected by the robot, and, as will be seen shortly, they are the only unpredictable part of the new state transition equation. Therefore, $\vec{\Theta}(\vec{x},\vec{u}) \subseteq Y$, the original observation space. A new nature action, $\vec{\theta} \in \vec{\Theta}(\vec{x},\vec{u})$, is just an observation, $\vec{\theta} = y$. The set $\vec{\Theta}(\vec{x},\vec{u})$ generally depends on $\vec{x}$ and $\vec{u}$ because some observations may be impossible to receive from some states. For example, if a sensor that measures a mobile robot's position is never wrong by more than one meter, then observations that lie further than one meter from the true robot position are impossible.
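Continuing the sketch (the helpers forward and obs_model are hypothetical stand-ins for the original transition and sensor models), the derived nature action set $\vec{\Theta}(\vec{x},\vec{u})$ for a nondeterministic I-state can be computed by collecting every observation that some state reachable under the applied action could produce:

```python
def possible_observations(x_ndet, u, forward, obs_model):
    """Sketch of the derived nature action set (the receivable observations).

    x_ndet    : frozenset of states consistent with the history I-state
    u         : the action applied at the current stage
    forward   : forward(x, u) -> set of states reachable from x under u
    obs_model : obs_model(x) -> set of observations receivable in state x
    """
    theta = set()
    for x in x_ndet:
        for x_next in forward(x, u):     # states possible at the next stage
            theta |= obs_model(x_next)   # observations those states allow
    return frozenset(theta)
```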
A derived state transition equation is defined with $\vec{f}(\vec{x}_k,\vec{u}_k,\vec{\theta}_k)$ and yields a new state, $\vec{x}_{k+1}$. Using the original notation, this is just a function that uses $\kappa(\eta_k)$, $u_k$, and $y_{k+1}$ to compute the next derived I-state, $\kappa(\eta_{k+1})$, which is allowed because we are working with sufficient I-maps, as described in Section 11.2.1.
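As an illustration only, the following sketch implements the derived transition $\vec{f}$ for the nondeterministic case using the same hypothetical helpers as above: forward-project the current I-state under $u_k$ and then intersect with the states consistent with the received observation $y_{k+1}$ (the update of Section 11.2.2).

```python
def f_derived(x_ndet, u, y, forward, consistent_with):
    """Sketch of x_{k+1} = f(x_k, u_k, theta_k), where the nature action
    theta_k is simply the next observation y.

    consistent_with : consistent_with(y) -> set of states that could have
                      produced observation y (hypothetical helper)
    """
    # Forward projection: states reachable from some state in X_k(eta_k).
    predicted = set()
    for x in x_ndet:
        predicted |= forward(x, u)
    # Incorporate the observation by intersecting with the consistent states.
    return frozenset(predicted) & frozenset(consistent_with(y))
```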
Initial states and goal sets are optional and can be easily formulated in the new representation. The initial I-state, $\eta_0$, becomes the new initial state, $\vec{x}_1 = \eta_0$. It is assumed that $\eta_0$ is either a subset of $X$ or a probability distribution, depending on whether planning occurs in $\mathcal{I}_{ndet}$ or $\mathcal{I}_{prob}$. In the nondeterministic case, the new goal set $\vec{X}_G$ can be derived from the original goal set $X_G \subseteq X$ as

$$\vec{X}_G = \{ X(\eta) \in \mathcal{I}_{ndet} \mid X(\eta) \subseteq X_G \}, \qquad (12.1)$$

the set of derived I-states for which the true state is guaranteed to lie in $X_G$.
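In code, the goal test implied by (12.1) is just a subset check; this sketch assumes the original goal set is given as a Python set of states.

```python
def in_derived_goal(x_ndet, X_G):
    """True iff the derived I-state lies in the new goal set of (12.1):
    every state consistent with the history belongs to X_G."""
    return x_ndet <= X_G  # subset test: X(eta) is contained in X_G
```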
The only remaining portion of Formulation 10.1 is the cost functional. We will develop a cost model that uses only the state and action histories. A dependency on nature would imply that the costs depend directly on the observation, $\vec{\theta} = y$, which was not assumed in Formulation 11.1. The general $K$-stage cost functional from Formulation 10.1 appears in this context as

$$\vec{L}(\vec{x}_1,\ldots,\vec{x}_{K+1},\vec{u}_1,\ldots,\vec{u}_K) = \sum_{k=1}^{K} \vec{l}(\vec{x}_k,\vec{u}_k) + \vec{l}_F(\vec{x}_{K+1}).$$

The cost functional $\vec{L}$ must be derived from the cost functional $L$ of the original problem; the latter is expressed in terms of states, which are unknown. First consider the case of $\mathcal{I}_{prob}$. The state $x_k$ at stage $k$ follows the probability distribution $P(x_k \mid \eta_k)$, as derived in Section 11.2.3. Using $P(x_k \mid \eta_k)$, an expected cost is assigned as

$$\vec{l}(\vec{x}_k,\vec{u}_k) = \sum_{x_k \in X} l(x_k,u_k) \, P(x_k \mid \eta_k)$$

and

$$\vec{l}_F(\vec{x}_{K+1}) = \sum_{x_{K+1} \in X} l_F(x_{K+1}) \, P(x_{K+1} \mid \eta_{K+1}).$$
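These expected costs translate directly into sums over the belief; a minimal sketch, assuming the original per-stage and final cost functions l(x, u) and lF(x) are available as Python callables:

```python
def l_derived(x_prob, u, l):
    """Expected per-stage cost: the sum of l(x, u) * P(x | eta) over x."""
    return sum(l(x, u) * p for x, p in x_prob.items())

def lF_derived(x_prob, lF):
    """Expected final cost: the sum of lF(x) * P(x | eta) over x."""
    return sum(lF(x) * p for x, p in x_prob.items())
```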
Ideally, we would like to make analogous expressions for the case of $\mathcal{I}_{ndet}$; however, there is one problem. Formulating the worst-case cost for each stage is too pessimistic. For example, it may be possible to obtain high costs in two consecutive stages, but each of these may correspond to following different paths in $X$. There is nothing to constrain the worst-case analysis to the same path. In the probabilistic case there is no problem because probabilities can be assigned to paths. For the nondeterministic case, a cost functional can be defined, but the stage-additive property needed for dynamic programming is destroyed in general. Under some restrictions on allowable costs, the stage-additive property is preserved.
The state $x_k$ at stage $k$ is known to lie in $X_k(\eta_k)$, as derived in Section 11.2.2. For every history I-state, $\eta_k$, and action, $u_k$, assume that $l(x_k,u_k)$ is invariant over all $x_k \in X_k(\eta_k)$. In this case,

$$\vec{l}(\vec{x}_k,\vec{u}_k) = l(x_k,u_k),$$

in which $x_k$ is any state in $X_k(\eta_k)$.
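The invariance assumption can be checked directly in the nondeterministic sketch: if l(x, u) takes a single value over the whole I-state, that value is assigned to the derived I-state; otherwise the stage-additive form above does not apply. The helper below is illustrative only.

```python
def l_derived_ndet(x_ndet, u, l):
    """Per-stage cost on a nondeterministic I-state, valid only when
    l(x, u) is invariant over all x in X_k(eta_k)."""
    costs = {l(x, u) for x in x_ndet}
    if len(costs) != 1:
        raise ValueError("l(x, u) is not invariant over X_k(eta_k)")
    return costs.pop()
```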
A plan on the derived I-space, $\mathcal{I}_{ndet}$ or $\mathcal{I}_{prob}$, can now also be considered as a plan on the new state space $\vec{X}$. Thus, state feedback is now possible, but in a larger state space $\vec{X}$ instead of $X$. The outcomes of actions are still generally unpredictable due to the observations. An interesting special case occurs when there are no observations. In this case, the I-state is predictable because it is derived only from actions that are chosen by the robot. The new formulation therefore does not need nature actions, which reduces it to Formulation 2.3. Consequently, feedback is no longer needed if the initial I-state is given, and a plan can be expressed once again as a sequence of actions. Even though the original states are not predictable, the future information states are! This means that the state trajectory in the new state space is completely predictable as well.
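As a final illustration of the observation-free special case, the I-state trajectory can be computed ahead of time from the initial I-state and an action sequence alone; this sketch reuses the hypothetical forward model from above, with no observation step needed.

```python
def predict_istate_trajectory(x0_ndet, actions, forward):
    """Open-loop I-state trajectory for the case of no observations:
    each new I-state is simply the forward projection of the previous one."""
    trajectory = [frozenset(x0_ndet)]
    for u in actions:
        nxt = set()
        for x in trajectory[-1]:
            nxt |= forward(x, u)
        trajectory.append(frozenset(nxt))
    return trajectory
```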