Let us now turn to the final column of our table, considering the progress that has been made in avoiding rework in Boolean satisfiability engines. The basic idea is to avoid situations where a conventional implementation will ``thrash'', solving the same subproblem many times in different contexts.
To understand the source of the difficulty, consider an example
involving a SAT problem with variables
, and suppose also that the subset of
involving
only variables
in fact implies that
must be true.
Now imagine that we have constructed a partial solution that values
the variables
, and that we initially set
to false. After some amount of backtracking, we realize that
must be true. Unfortunately, if we subsequently change the
value of one of the
's for
, we will ``forget'' that
needs to be true and are in danger of setting it to false
once again, followed by a repetition of the search that showed
to be a consequence of
. Indeed, we are in danger of
solving this subproblem not just twice, but once for each search node
that we examine in the space of
's for
.
As we have indicated, the solution to this problem is to cache the
fact that needs to be true; this information is generally
saved in the form of a new clause called a nogood. In this
case, we record the unary clause
. Modifying our original
problem
in this way will allow us to immediately prune any
subproblem for which we have set
to false. This technique
was introduced by Stallman and Sussman Sussman:ddb in dependency directed backtracking.
Learning new constraints in this way can prevent thrashing. When a contradiction is encountered, the set of assignments that caused the contradiction is identified; we will call this set the conflict set. A new constraint can then be constructed that excludes the assignments in the conflict set. Adding this constraint to the problem will ensure that the faulty set of assignments is avoided in the future.
This description of learning is fairly syntactic; we can also give a
more semantic description. Suppose that our partial assignment
contains
and that our problem contains the
clauses
This is a general phenomenon. At any point, suppose that is a
variable that has been set in the partial assignment
. If
's
value is the result of a branch choice, there is no associated reason.
If
's current value is the result of a unit propagation, however, we associate
to
as a reason the clause that produced the propagation. If
's
value is the result of a backtrack, that value must be the result of a
contradiction identified for some subsequent variable
and we set
the reason for
to be the result of resolving the reasons for
and
. At any point, any variable whose value has been forced
will have an associated reason, and these accumulated reasons will
avoid the need to reexamine any particular portion of the search
space.
Modifying DPLL to exploit the derived information requires that we
include the derived clauses in the overall problem , thus enabling
new unit propagations and restricting subsequent search. But there is
an implicit change as well.
In our earlier example, suppose that we have set the variables
in that order, and that we have learned the nogood
While attractive in theory, however, this technique is difficult to
apply in practice. The reason is that a new nogood is learned with
every backtrack; since the number of backtracks is proportional to the
running time of the program, an exponential number of nogoods can be
learned. This can be expected both to overtax the memory available on
the system where the algorithm is running, and to increase the time
for each node expansion. As the number of clauses containing any
particular variable grows without bound, the unit propagation procedure
2.3 will grind to a halt.
Addressing this problem was the primary focus of work on systematic SAT solvers during the 1990's. Since it was impractical to retain all of the nogoods that had been learned, some method needed to be found that would allow a polynomial number of nogoods to be retained while others were discarded. The hope was that a relatively small number of nogoods could still be used to prune the search space effectively.6