The notions of feasible and optimal plans need to be reconsidered in
the context of feedback planning because the initial condition is not
given.  A plan $\pi$ is called a solution to the feasible planning
problem if, from every $x \in X$ from which $X_G$ is reachable, the
goal set is indeed reached by executing $\pi$ from $x$.  This means
that the cost functional is ignored (an alternative to Formulation 8.1
can be defined in which the cost functional is removed).  For
convenience, $\pi$ will be called a feasible feedback plan.

Now consider optimality.  From a given state $x$, it is clear that an
optimal plan exists using the concepts of Section 2.3.  Is it possible
that a different optimal plan needs to be associated with every
$x \in X$ that can reach $X_G$?  It turns out that only one plan is
needed to encode optimal paths from every initial state to $X_G$.  Why
is this true?  Suppose that the optimal cost-to-go is computed over $X$
using Dijkstra's algorithm or value iteration, as covered in
Section 2.3.  Every cost-to-go value at some $x \in X$ indicates the
cost received under the implementation of the optimal open-loop plan
from $x$.  The first step in this optimal plan can be determined by
(2.19), which yields a new state $x' = f(x,u)$.  From $x'$, (2.19) can
be applied once again to determine the next optimal action.  The cost
at $x'$ represents both the optimal cost-to-go if $x'$ is the initial
condition and also the optimal cost-to-go when continuing on the
optimal path from $x$.  The two must be equivalent because of the
dynamic programming principle.  Since all such costs must coincide, a
single feedback plan can be used to obtain the optimal cost-to-go from
every initial condition.

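
The argument above can be traced on a small weighted graph.  The sketch
below is illustrative only: the four-state graph, its edge costs, and
the function names are assumptions, and the one-step minimization in
greedy_plan is assumed here to play the role of (2.19).  It computes the
optimal cost-to-go with a backward Dijkstra search, extracts a single
plan by choosing at each state the action that minimizes the step cost
plus the cost-to-go of the next state, and then confirms that executing
that one plan from each state accumulates exactly the cost-to-go stored
at that state.

```python
import heapq

# Hypothetical system: edges[x] maps an action label to (next_state, cost).
edges = {
    "a": {"to_b": ("b", 1.0), "to_c": ("c", 4.0)},
    "b": {"to_c": ("c", 1.0), "to_g": ("g", 5.0)},
    "c": {"to_g": ("g", 1.0)},
    "g": {},
}
goal = {"g"}

def cost_to_go(edges, goal):
    """Backward Dijkstra: optimal cost-to-go for states that reach goal."""
    preds = {x: [] for x in edges}
    for x, acts in edges.items():
        for nxt, cost in acts.values():
            preds[nxt].append((x, cost))
    G = {x: 0.0 for x in goal}
    heap = [(0.0, x) for x in goal]
    while heap:
        g, x = heapq.heappop(heap)
        if g > G.get(x, float("inf")):
            continue  # stale queue entry
        for p, cost in preds[x]:
            if g + cost < G.get(p, float("inf")):
                G[p] = g + cost
                heapq.heappush(heap, (G[p], p))
    return G

def greedy_plan(edges, goal, G):
    """At each state pick the action minimizing step cost + next cost-to-go."""
    pi = {}
    for x in G:
        if x in goal:
            continue
        candidates = {u: cost + G[nxt]
                      for u, (nxt, cost) in edges[x].items() if nxt in G}
        pi[x] = min(candidates, key=candidates.get)
    return pi

def rollout_cost(edges, goal, pi, x):
    """Total cost accumulated by executing pi from x until the goal."""
    total = 0.0
    while x not in goal:
        x, cost = edges[x][pi[x]]
        total += cost
    return total

G = cost_to_go(edges, goal)
pi = greedy_plan(edges, goal, G)
for x in sorted(pi):
    print(x, pi[x], G[x], rollout_cost(edges, goal, pi, x))
```

In this example the path from a passes through b and c, and the
cost-to-go values stored at b and c are the same ones obtained when b
or c is treated as the initial state, which is why one plan suffices.
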
A feedback plan $\pi$ is therefore defined as optimal if from every
$x \in X$, the total cost, $L(\pi,x)$, obtained by executing $\pi$ is
the lowest among all possible plans.  The requirement that this holds
for every initial condition is important for feedback planning.

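
Stated in symbols, the definition might be written as follows.  The
notation $\Pi$ for the set of all feedback plans and the starred symbol
for an optimal plan are introduced here for illustration and are
assumptions, not notation taken from the text.

```latex
% \Pi (set of all feedback plans) and \pi^* are notation assumed here.
\[
  \pi^* \text{ is optimal}
  \quad\Longleftrightarrow\quad
  L(\pi^*, x) \;=\; \min_{\pi \in \Pi} L(\pi, x)
  \quad \text{for every } x \in X .
\]
```

The quantifier over every state is the essential part: a plan that
attains the minimum from only one particular initial state would not
qualify under this definition.
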
The possible actions are up ($\uparrow$), down ($\downarrow$), left
($\leftarrow$), right ($\rightarrow$), and terminate ($u_T$); some
directions are not available from some states.  A solution feedback
plan is depicted in Figure 8.2.  Many other solution plans exist.  The
one shown here happens to be optimal in terms of the number of steps to
the goal.  Some alternative feedback plans are also optimal (figure out
which arrows can be changed).  To apply the plan from any initial
state, simply follow the arrows to the goal.  In each stage, the
application of the action represented by the arrow leads to the next
state.  The process terminates when $u_T$ is applied at the goal.
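
Following the arrows can be sketched in a few lines.  The grid below is
a hypothetical obstacle-free 3 x 3 example with the goal in one corner;
it is not the grid of Figure 8.2, and the action names, coordinate
convention, and the execute helper are assumptions made for
illustration.

```python
# Displacement of each motion action, with y increasing upward (assumed).
MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

# Hypothetical feedback plan: one action per state, goal at (2, 2).
plan = {
    (0, 0): "right", (1, 0): "right", (2, 0): "up",
    (0, 1): "right", (1, 1): "right", (2, 1): "up",
    (0, 2): "right", (1, 2): "right", (2, 2): "terminate",
}

def execute(plan, start, max_steps=100):
    """Follow the plan from start; return the list of visited states."""
    path = [start]
    x = start
    for _ in range(max_steps):
        u = plan[x]
        if u == "terminate":
            return path  # the termination action ends the execution
        dx, dy = MOVES[u]
        x = (x[0] + dx, x[1] + dy)
        path.append(x)
    raise RuntimeError("plan did not terminate within max_steps")

print(execute(plan, (0, 0)))
```

In this sketch every state reaches the goal in a number of steps equal
to its Manhattan distance from (2, 2), so the plan is optimal in the
number of steps.  Swapping an arrow in the interior, for example
changing the action at (1, 1) from right to up, gives a different plan
that is equally short.
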
 
 
Steven M LaValle 2020-08-14