Figure 2.4 gives a general template of search algorithms, expressed using the state-space representation. At any point during the search, there will be three kinds of states:

1. Unvisited: States that have not been visited yet. Initially, this is every state except $x_I$.
2. Dead: States that have been visited, and for which every possible next state has also been visited. These states can contribute nothing further to the search.
3. Alive: States that have been encountered, but that may still have unvisited next states. Initially, the only alive state is $x_I$.
The set of alive states is stored in a priority queue, $Q$, for which a priority function must be specified. The only significant difference between various search algorithms is the particular function used to sort $Q$. Many variations will be described later, but for the time being, it might be helpful to pick one. Therefore, assume for now that $Q$ is a common FIFO (First-In First-Out) queue; whichever state has been waiting the longest will be chosen when $Q$.GetFirst() is called. The rest of the general search algorithm is quite simple. Initially, $Q$ contains the initial state $x_I$. A while loop is then executed, which terminates only when $Q$ is empty. This will only occur when the entire graph has been explored without finding any goal states, which results in a FAILURE (unless the reachable portion of $X$ is infinite, in which case the algorithm should never terminate). In each while iteration, the highest ranked element, $x$, of $Q$ is removed. If $x$ lies in $X_G$, then it reports SUCCESS and terminates; otherwise, the algorithm tries applying every possible action, $u \in U(x)$. For each next state, $x' = f(x,u)$, it must determine whether $x'$ is being encountered for the first time. If it is unvisited, then it is inserted into $Q$; otherwise, there is no need to consider it because it must be either dead or already in $Q$.
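As a concrete illustration, here is a minimal Python sketch of this template with $Q$ instantiated as a FIFO queue. The names `forward_search`, `goal`, `U`, and `f` are placeholders chosen for this sketch, not part of Figure 2.4.

```python
from collections import deque

def forward_search(x_I, goal, U, f):
    """Sketch of the general forward-search template with Q as a FIFO queue.

    x_I  -- initial state
    goal -- predicate: goal(x) is True iff x lies in the goal set X_G
    U    -- U(x) yields the actions available at state x
    f    -- f(x, u) returns the next state obtained by applying u at x
    """
    Q = deque([x_I])              # alive states
    visited = {x_I}               # states that are alive or dead
    while Q:
        x = Q.popleft()           # FIFO: the state waiting the longest
        if goal(x):
            return "SUCCESS"
        for u in U(x):
            x_next = f(x, u)
            if x_next not in visited:   # encountered for the first time?
                visited.add(x_next)
                Q.append(x_next)
            # otherwise x_next is dead or already in Q; ignore it
    return "FAILURE"              # Q exhausted without reaching X_G
```

With a FIFO queue this behaves as breadth-first search; replacing the queue with a different priority ordering yields the other variants described later.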
The algorithm description in Figure 2.4 omits several details that often become important in practice. For example, how efficient is the test to determine whether $x \in X_G$ in line 4? This depends, of course, on the size of the state space and on the particular representations chosen for $x$ and $X_G$. At this level, we do not specify a particular method because the representations are not given.
One important detail is that the existing algorithm only indicates whether a solution exists, but does not seem to produce a plan, which is a sequence of actions that achieves the goal. This can be fixed by inserting a line after line 7 that associates $f(x,u)$ with its parent, $x$. If this is performed each time, one can simply trace the pointers from the final state to the initial state to recover the plan. For convenience, one might also store which action was taken, in addition to the pointer from $f(x,u)$ to $x$.
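One way to realize this bookkeeping is sketched below in Python: each newly visited state records its parent and the action that produced it, and the plan is recovered by tracing these records back from the goal. As before, the names are placeholders for this sketch.

```python
from collections import deque

def forward_search_with_plan(x_I, goal, U, f):
    """Forward search that also returns a plan (a sequence of actions)."""
    Q = deque([x_I])
    parent = {x_I: None}          # maps f(x,u) -> (x, u) that first reached it
    while Q:
        x = Q.popleft()
        if goal(x):
            plan = []             # trace pointers from the final state to x_I
            while parent[x] is not None:
                x_prev, u = parent[x]
                plan.append(u)
                x = x_prev
            plan.reverse()
            return plan
        for u in U(x):
            x_next = f(x, u)
            if x_next not in parent:
                parent[x_next] = (x, u)   # associate f(x,u) with its parent x
                Q.append(x_next)
    return None                   # FAILURE: no goal state is reachable
```

For example, with actions $U(x) = \{1, 2\}$ and $f(x,u) = x + u$, the call `forward_search_with_plan(0, lambda x: x == 5, lambda x: [1, 2], lambda x, u: x + u)` returns the action sequence `[1, 2, 2]`.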
Lines 8 and 9 are conceptually simple, but how can one tell whether $f(x,u)$ has been visited? For some problems the state transition graph might actually be a tree, which means that there are no repeated states. Although this does not occur frequently, it is wonderful when it does because there is no need to check whether states have been visited. If the states in $X$ all lie on a grid, one can simply make a lookup table that can be accessed in constant time to determine whether a state has been visited. In general, however, it might be quite difficult because the state $f(x,u)$ must be compared with every other state in $Q$ and with all of the dead states. If the representation of each state is long, as is sometimes the case, this will be very costly. A good hashing scheme or another clever data structure can greatly alleviate this cost, but in many applications the computation time will remain high. One alternative is to simply allow repeated states, but this could lead to an increase in computational cost that far outweighs the benefits. Even if the graph is very small, search algorithms could run in time exponential in the size of the state transition graph, or the search may not terminate at all, even if the graph is finite.
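To make the two bookkeeping options concrete, the following Python fragment sketches both a grid lookup table and a hash-based visited set; the grid dimensions and the tuple encoding of a state are assumptions made only for this illustration.

```python
# Grid case: states are integer cells (i, j), so a Boolean table gives a
# constant-time visited test (grid dimensions are assumed for illustration).
WIDTH, HEIGHT = 100, 100
visited_grid = [[False] * HEIGHT for _ in range(WIDTH)]

def mark_visited_grid(i, j):
    visited_grid[i][j] = True

def is_visited_grid(i, j):
    return visited_grid[i][j]

# General case: hash an immutable encoding of the state (here, a tuple).
# A set gives expected constant-time membership tests, avoiding a direct
# comparison against every alive and dead state.
visited_set = set()

def mark_visited(x):
    visited_set.add(tuple(x))

def is_visited(x):
    return tuple(x) in visited_set
```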
One final detail is that some search algorithms will require a cost to be computed and associated with every state. If the same state is reached multiple times, the cost may have to be updated, which is performed in line 12, if the particular search algorithm requires it. Such costs may be used in some way to sort the priority queue, or they may enable the recovery of the plan on completion of the algorithm. Instead of storing pointers, as mentioned previously, the optimal cost to return to the initial state could be stored with each state. This cost alone is sufficient to determine the action sequence that leads to any visited state. Starting at a visited state, the path back to $x_I$ can be obtained by traversing the state transition graph backward in a way that decreases the cost as quickly as possible in each step. For this to succeed, the costs must have a certain monotonicity property, which is obtained by Dijkstra's algorithm and $A^*$ search, and will be introduced in Section 2.2.2. More generally, the costs must form a navigation function, which is considered in Section 8.2.2 as feedback is incorporated into discrete planning.
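A rough Python sketch of this idea, under the assumption that edge costs are positive and given by a placeholder function `cost(x, u)`, is shown below: a Dijkstra-like search stores the optimal cost-to-come for every reachable state, and a path is then read off by repeatedly stepping backward to the predecessor whose stored cost is smallest.

```python
import heapq
import itertools

def dijkstra_cost_to_come(x_I, U, f, cost):
    """Compute the optimal cost-to-come C(x) from x_I for every reachable
    state, sorting Q by cost-to-come (a Dijkstra-like search)."""
    C = {x_I: 0.0}
    tie = itertools.count()          # tie-breaker so states need not be comparable
    Q = [(0.0, next(tie), x_I)]
    while Q:
        c, _, x = heapq.heappop(Q)
        if c > C[x]:
            continue                 # stale queue entry
        for u in U(x):
            x_next = f(x, u)
            c_next = c + cost(x, u)
            if c_next < C.get(x_next, float("inf")):
                C[x_next] = c_next   # cost update when a state is re-reached
                heapq.heappush(Q, (c_next, next(tie), x_next))
    return C

def recover_path(x_goal, x_I, U, f, C):
    """Walk backward from x_goal toward x_I, at each step moving to the
    predecessor whose cost-to-come is smallest (steepest descent in C)."""
    path = [x_goal]
    x = x_goal
    while x != x_I:
        preds = [x_prev for x_prev in C for u in U(x_prev) if f(x_prev, u) == x]
        x = min(preds, key=C.get)    # decrease the cost as quickly as possible
        path.append(x)
    path.reverse()
    return path
```

The backward walk relies on the monotonicity just mentioned: with positive edge costs, every reachable state other than $x_I$ has a predecessor of strictly smaller cost-to-come, so the descent is guaranteed to terminate at $x_I$.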