The notions of feasible and optimal plans need to be reconsidered in the context of feedback planning because the initial condition is not given. A plan $\pi$ is called a solution to the feasible planning problem if, from every $x \in X$ from which $X_G$ is reachable, the goal set is indeed reached by executing $\pi$ from $x$. This means that the cost functional is ignored (an alternative to Formulation 8.1 can be defined in which the cost functional is removed). For convenience, $\pi$ will be called a feasible feedback plan.
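The feasibility condition can be checked mechanically on a small discrete problem. The following is a minimal sketch, not the book's algorithm: the toy graph `EDGES`, the goal set `GOAL`, the two plans, and all function names are hypothetical, and a plan is assumed to be a dictionary mapping each state to one action name. It first finds every state from which the goal is reachable by a backward search, then executes the plan from each of them.

```python
from collections import deque

# Hypothetical toy problem: EDGES[x][u] is the state reached by action u from x.
EDGES = {
    0: {"a": 1, "b": 2},
    1: {"a": 3},
    2: {"a": 0},
    3: {},           # goal state, no actions needed
    4: {"a": 4},     # the goal is unreachable from here
}
GOAL = {3}

def states_that_can_reach_goal(edges, goal):
    """Backward breadth-first search from the goal set."""
    preds = {x: set() for x in edges}
    for x, actions in edges.items():
        for nxt in actions.values():
            preds[nxt].add(x)
    reached = set(goal)
    queue = deque(goal)
    while queue:
        x = queue.popleft()
        for p in preds[x]:
            if p not in reached:
                reached.add(p)
                queue.append(p)
    return reached

def plan_reaches_goal(plan, edges, goal, start):
    """Execute the plan from start; a revisited state means it loops forever."""
    x, visited = start, set()
    while x not in goal:
        if x in visited or x not in plan:
            return False
        visited.add(x)
        x = edges[x][plan[x]]
    return True

def is_feasible(plan, edges, goal):
    """Feasible: the goal is reached from every state that can reach it at all."""
    candidates = states_that_can_reach_goal(edges, goal)
    return all(plan_reaches_goal(plan, edges, goal, x) for x in candidates)

GOOD_PLAN = {0: "a", 1: "a", 2: "a", 4: "a"}
BAD_PLAN = {0: "b", 1: "a", 2: "a", 4: "a"}   # cycles between states 0 and 2

print(is_feasible(GOOD_PLAN, EDGES, GOAL))
print(is_feasible(BAD_PLAN, EDGES, GOAL))
```

State 4 does not affect the result for either plan, because the definition only constrains states from which the goal is reachable.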
Now consider optimality. From a given state $x$, it is clear that an optimal plan exists using the concepts of Section 2.3. Is it possible that a different optimal plan needs to be associated with every $x \in X$ that can reach $X_G$? It turns out that only one plan is needed to encode optimal paths from every initial state to $X_G$. Why is this true? Suppose that the optimal cost-to-go is computed over $X$ using Dijkstra's algorithm or value iteration, as covered in Section 2.3. Every cost-to-go value at some $x \in X$ indicates the cost received under the implementation of the optimal open-loop plan from $x$. The first step in this optimal plan can be determined by (2.19), which yields a new state $x' = f(x,u)$. From $x'$, (2.19) can be applied once again to determine the next optimal action. The cost at $x'$ represents both the optimal cost-to-go if $x'$ is the initial condition and also the optimal cost-to-go when continuing on the optimal path from $x$. The two must be equal because of the dynamic programming principle. Since all such costs must coincide, a single feedback plan can be used to obtain the optimal cost-to-go from every initial condition.
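The argument above can be illustrated on a small graph. This is a minimal sketch under stated assumptions, not the book's implementation: the graph `TRANSITIONS`, its step costs, and all function names are hypothetical, step costs are assumed positive, and the action choice in `extract_plan` is assumed to take the form of a minimization over actions of step cost plus cost-to-go at the next state, which is how (2.19) is used in the text. The sketch computes the cost-to-go by a simple value iteration, extracts one plan from it, and compares the cost of executing that plan from each state with the cost-to-go.

```python
import math

# Hypothetical toy graph: TRANSITIONS[x][u] = (next_state, step_cost).
TRANSITIONS = {
    "a": {"to_b": ("b", 1.0), "to_c": ("c", 4.0)},
    "b": {"to_c": ("c", 1.0), "to_g": ("g", 5.0)},
    "c": {"to_g": ("g", 1.0)},
    "g": {},
}
GOAL = {"g"}

def cost_to_go(transitions, goal):
    """Value iteration: repeat the minimization until no value changes."""
    G = {x: (0.0 if x in goal else math.inf) for x in transitions}
    changed = True
    while changed:
        changed = False
        for x, actions in transitions.items():
            if x in goal:
                continue
            best = min((c + G[nxt] for nxt, c in actions.values()),
                       default=math.inf)
            if best < G[x]:
                G[x] = best
                changed = True
    return G

def extract_plan(transitions, goal, G):
    """One action per state: the minimizer of step cost plus cost-to-go."""
    plan = {}
    for x, actions in transitions.items():
        if x in goal or math.isinf(G[x]):
            continue  # nothing to do at the goal or where it is unreachable
        plan[x] = min(actions, key=lambda u: actions[u][1] + G[actions[u][0]])
    return plan

def executed_cost(plan, transitions, goal, start):
    """Accumulate step costs while following the plan from start to the goal."""
    x, total = start, 0.0
    while x not in goal:
        x, step = transitions[x][plan[x]]
        total += step
    return total

G = cost_to_go(TRANSITIONS, GOAL)
plan = extract_plan(TRANSITIONS, GOAL, G)
for x in plan:
    # The single plan attains the cost-to-go from every start state.
    print(x, plan[x], G[x], executed_cost(plan, TRANSITIONS, GOAL, x))
```

The same dictionary `plan` is used for every start state; no per-start plan is stored, which is the point of the dynamic programming argument.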
A feedback plan $\pi$ is therefore defined as optimal if, from every $x \in X$, the total cost, $L(\pi,x)$, obtained by executing $\pi$ is the lowest among all possible plans. The requirement that this holds for every initial condition is important for feedback planning.
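The definition can be written compactly as follows. The notation is an assumption made for this sketch: $L(\pi, x)$ denotes the total cost of executing plan $\pi$ from state $x$, $G^*(x)$ denotes the optimal cost-to-go at $x$, and $\pi^*$ denotes an optimal feedback plan.

```latex
\pi^* \text{ is optimal}
\quad\Longleftrightarrow\quad
L(\pi^*, x) \;=\; \min_{\pi} L(\pi, x) \;=\; G^*(x)
\qquad \text{for all } x \in X .
```

The quantifier over all $x \in X$ sits outside the minimization: one fixed $\pi^*$ must attain the minimum at every state simultaneously, not a separate minimizer for each $x$.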
Steven M LaValle 2020-08-14