2.5.3 Planning as Satisfiability

Another interesting approach is to convert the planning problem into an enormous Boolean satisfiability problem. This means that the planning problem of Formulation 2.4 can be solved by determining whether some assignment of variables is possible for a Boolean expression that leads to a TRUE value. Generic methods for determining satisfiability can be directly applied to the Boolean expression that encodes the planning problem. The Davis-Putnam procedure is one of the most widely known algorithms for satisfiability. It performs a depth-first search by iteratively trying assignments for variables and backtracking when assignments fail. During the search, large parts of the expression can be eliminated due to the current assignments. The algorithm is complete and reasonably efficient. Its use in solving planning problems is surveyed in [382]. In practice, stochastic local search methods provide a reasonable alternative to the Davis-Putnam procedure [459].

Suppose a planning problem has been given in terms of Formulation 2.4. All literals and operators will be tagged with a stage index. For example, a literal that appears in two different stages will be considered distinct. This kind of tagging is similar to situation calculus [378]; however, in that case, variables are allowed for the tags. To obtain a finite, Boolean expression the total number of stages must be declared. Let $ K$ denote the number of stages at which operators can be applied. As usual, the fist stage is $ k=1$ and the final stage is $ k = F = K+1$. Setting a stage limit is a significant drawback of the approach because this is usually not known before the problem is solved. A planning algorithm can assume a small value for $ F$ and then gradually increase it each time the resulting Boolean expression is not satisfied. If the problem is not solvable, however, this approach iterates forever.

Let $ \vee$ denote logical OR, and let $ \wedge$ denote logical AND. The Boolean expression is written as a conjunction2.5 of many terms, which arise from five different sources:

  1. Initial state: A conjunction of all literals in $ S$ is formed, along with the negation of all positive literals not in $ S$. These are all tagged with $ 1$, the initial stage index.
  2. Goal state: A conjunction of all literals in $ G$, tagged with the final stage index, $ F = K+1$.
  3. Operator encodings: Each operator must be copied over the stages. For each $ o \in O$, let $ o_k$ denote the operator applied at stage $ k$. A conjunction is formed over all operators at all stages. For each $ o_k$, the expression is

    $\displaystyle \neg o_k \vee \left( p_1 \wedge p_2 \wedge \cdots \wedge p_m \wedge e_1 \wedge e_2 \wedge \cdots \wedge e_n \right) ,$ (2.33)

    in which $ p_1$, $ \ldots $, $ p_m$ are the preconditions of $ o_k$, and $ e_1$, $ \ldots $, $ e_n$ are the effects of $ o_k$.
  4. Frame axioms: The next part is to encode the implicit assumption that every literal that is not an effect of the applied operator remains unchanged in the next stage. This can alternatively be stated as follows: If a literal $ l$ becomes negated to $ \neg l$, then an operator that includes $ \neg l$ as an effect must have been executed. (If $ l$ was already a negative literal, then $ \neg l$ is a positive literal.) For each stage and literal, an expression is needed. Suppose that $ l_k$ and $ l_{k+1}$ are the same literal but are tagged for different stages. The expression is

    $\displaystyle (l_k \vee \neg l_{k+1}) \vee (o_{k,1} \vee o_{k,2} \vee \cdots \vee o_{k,j}) ,$ (2.34)

    in which $ o_{k,1}$, $ \ldots $, $ o_{k,j}$ are the operators, tagged for stage $ k$, that contain $ l_{k+1}$ as an effect. This ensures that if $ \neg l_k$ appears, followed by $ l_{k+1}$, then some operator must have caused the change.
  5. Complete exclusion axiom: This indicates that only one operator applies at every stage. For every stage $ k$, and any pair of stage-tagged operators $ o_k$ and $ o'_k$, the expression is

    $\displaystyle \neg o_k \vee \neg o'_k ,$ (2.35)

    which is logically equivalent to $ \neg(o_k \wedge o'_k)$ (meaning, ``not both at the same stage'').
It is shown in [512] that a solution plan exists if and only if the resulting Boolean expression is satisfiable.

The following example illustrates the construction.

Example 2..9 (The Flashlight Problem as a Boolean Expression)   A Boolean expression will be constructed for Example 2.6. Each of the expressions given below is joined into one large expression by connecting them with $ \wedge$'s.

The expression for the initial state is

$\displaystyle O(C,F,1) \wedge \neg I(B1,F,1) \wedge \neg I(B2,F,1) ,$ (2.36)

which uses the abbreviated names, and the stage tag has been added as an argument to the predicates. The expression for the goal state is

$\displaystyle O(C,F,5) \wedge I(B1,F,5) \wedge I(B2,F,5) ,$ (2.37)

which indicates that the goal must be achieved at stage $ k=5$. This value was determined because we already know the solution plan from (2.24). The method will also work correctly for a larger value of $ k$. The expressions for the operators are

\begin{displaymath}\begin{split}\neg PC_k & \vee (\neg O(C,F,k) \wedge O(C,F,k+1...
... O(C,F,k) \wedge \neg I(B2,F,k) \wedge I(B2,F,k+1)) \end{split}\end{displaymath} (2.38)

for each $ k$ from $ 1$ to $ 4$.

The frame axioms yield the expressions

\begin{displaymath}\begin{split}(O(C,F,k) \vee \neg O(C,F,k+1)) & \vee (PC_k) \\...
...vee (I2_k)  (\neg I(B2,F,k) \vee I(B2,F,k+1)) & , \end{split}\end{displaymath} (2.39)

for each $ k$ from $ 1$ to $ 4$. No operators remove batteries from the flashlight. Hence, two of the expressions list no operators.

Finally, the complete exclusion axiom yields the expressions

$\displaystyle \neg RC_k$ $\displaystyle \vee \neg PC_k$ $\displaystyle \qquad \neg RC_k$ $\displaystyle \vee \neg O1_k$ $\displaystyle \qquad \neg RC_k$ $\displaystyle \vee \neg O2_k$ (2.40)
$\displaystyle \neg PC_k$ $\displaystyle \vee \neg O1_k$ $\displaystyle \qquad \neg PC_k$ $\displaystyle \vee \neg O2_k$ $\displaystyle \qquad \neg O1_k$ $\displaystyle \vee \neg O2_k ,$    

for each $ k$ from $ 1$ to $ 4$. The full problem is encoded by combining all of the given expressions into an enormous conjunction. The expression is satisfied by assigning TRUE values to $ RC_1$, $ IB1_2$, $ IB2_3$, and $ PC_4$. An alternative solution is $ RC_1$, $ IB2_2$, $ IB1_3$, and $ PC_4$. The stage index tags indicate the order that the actions are applied in the recovered plan. $ \blacksquare$

Steven M LaValle 2020-08-14