1 Introduction

In recent years, the task of deciding whether a given propositional logic formula in conjunctive normal form (CNF) is satisfiable has gone from a problem of purely theoretical interest to a practical approach for solving real-world problems. Satisfiability (SAT) procedures are now a standard tool for hardware verification, including verification of super-scalar processors (Velev and Bryant, 2001; Biere et al., 1999a). Open problems in group theory have been encoded and solved using satisfiability solvers (Zhang and Hsiang, 1994). Other applications of SAT include circuit diagnosis and experiment design (Gomes et al., 1998b; Konuk and Larrabee, 1993).

The most surprising aspect of this recent practical progress is that the best complete satisfiability testing algorithms remain variants of the Davis-Putnam-Logemann-Loveland or DPLL procedure (Davis et al., 1962; Davis and Putnam, 1960), a backtrack search in the space of partial truth assignments. The key idea behind its efficacy is pruning the search space based on falsified clauses. Since its introduction in the early 1960s, the main improvements to DPLL have been smart branch selection heuristics (e.g., Li and Anbulagan, 1997) and extensions such as randomized restarts (Gomes et al., 1998a) and clause learning (see, e.g., Marques-Silva and Sakallah, 1996). One can argue that of these, clause learning has been the most significant in scaling DPLL to realistic problems. This paper attempts to understand the potential of clause learning and suggests ways to harness its power.
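To make the procedure concrete, the following is a minimal sketch of the DPLL recursion (our own illustrative code, not any particular solver's implementation; real solvers add unit propagation, branching heuristics, and the learning and restart extensions discussed below):

# Minimal DPLL sketch (illustrative only). A formula is a list of clauses;
# a clause is a list of nonzero integers, with -v denoting "not v".

def dpll(clauses, assignment=()):
    # Simplify under the current partial assignment: drop satisfied
    # clauses and remove falsified literals from the rest.
    simplified = []
    for clause in clauses:
        if any(lit in assignment for lit in clause):
            continue  # clause already satisfied
        reduced = [lit for lit in clause if -lit not in assignment]
        if not reduced:
            return None  # falsified clause: prune this branch (backtrack)
        simplified.append(reduced)
    if not simplified:
        return assignment  # every clause satisfied: a model found
    # Branch on some unassigned variable (real solvers use heuristics here).
    v = abs(simplified[0][0])
    for lit in (v, -v):
        result = dpll(simplified, assignment + (lit,))
        if result is not None:
            return result
    return None  # both branches failed under `assignment`

# Example: (x1 or x2) and (not x1 or x2) and (not x2 or x3)
print(dpll([[1, 2], [-1, 2], [-2, 3]]))  # prints (1, 2, 3), a satisfying assignment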

Clause learning grew out of work in AI on explanation-based learning (EBL), which sought to improve the performance of backtrack search algorithms by generating explanations for failure (backtrack) points and adding those explanations as new constraints on the original problem (Stallman and Sussman, 1977; de Kleer and Williams, 1987; Davis, 1984; Genesereth, 1984). For general constraint satisfaction problems the explanations are called ``conflicts'' or ``nogoods''; in the case of Boolean CNF satisfiability, the technique becomes clause learning: the reason for failure is learned in the form of a ``conflict clause'' that is added to the set of given clauses. A series of researchers (Moskewicz et al., 2001; Zhang, 1997; Bayardo Jr., 1997; Marques-Silva and Sakallah, 1996; Zhang et al., 2001) showed that clause learning can be efficiently implemented and used to solve hard problems that cannot be approached by any other technique.
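As a small illustration (our example, chosen for brevity): suppose the clause set contains $(\neg a \lor b)$, $(\neg a \lor c)$, $(\neg b \lor \neg c \lor d)$, and $(\neg b \lor \neg d)$. Branching on $a = \mathrm{true}$ forces $b$, $c$, and then $d$ by unit propagation, falsifying $(\neg b \lor \neg d)$. Resolving the falsified clause in turn against the clauses that forced $d$, $c$, and $b$,
\[
(\neg b \lor \neg d) \;\xrightarrow{(\neg b \lor \neg c \lor d)}\; (\neg b \lor \neg c) \;\xrightarrow{(\neg a \lor c)}\; (\neg a \lor \neg b) \;\xrightarrow{(\neg a \lor b)}\; (\neg a),
\]
yields the unit conflict clause $(\neg a)$, which is added to the formula so that no later branch revisits $a = \mathrm{true}$.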

Despite its importance, there has been little work on the formal properties of clause learning aimed at understanding its fundamental strengths and limitations. A likely reason for this inattention is that clause learning is a rather complex rule of inference; in fact, as we describe below, it is a complex family of rules of inference. A contribution of this paper is a precise mathematical specification of the various concepts used in describing clause learning.

Another problem in characterizing clause learning is defining a formal notion of the strength or power of a reasoning method. This paper uses the notion of proof complexity (Cook and Reckhow, 1977), which compares inference systems in terms of the sizes of the shortest proofs they sanction. We use CL to denote clause learning viewed as a proof system. A family of formulas C provides an exponential separation between systems S1 and S2 if the shortest proofs of formulas in C in system S1 are exponentially smaller than the corresponding shortest proofs in S2. From this basic propositional proof complexity point of view, only families of unsatisfiable formulas are of interest, because only proofs of unsatisfiability can be large; minimum proofs of satisfiability are linear in the number of variables of the formula. Nevertheless, Achlioptas et al. (2001) have shown how negative proof complexity results for unsatisfiable formulas can be used to derive time lower bounds for specific inference algorithms running on satisfiable formulas as well.
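In symbols (our phrasing of one standard formalization): a family of formulas $\{F_n\}$ exponentially separates $S_1$ from $S_2$ if
\[
\mathrm{size}_{S_1}(F_n) = n^{O(1)} \qquad \text{while} \qquad \mathrm{size}_{S_2}(F_n) = 2^{n^{\Omega(1)}},
\]
where $\mathrm{size}_S(F)$ denotes the size of the shortest proof of $F$ in system $S$ and $n$ measures the size of $F_n$.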

Proof complexity does not capture everything we intuitively mean by the power of a reasoning system, because it says nothing about how difficult it is to find short proofs. However, it is a good notion with which to begin our analysis, because the size of proofs provides a lower bound on the running time of any implementation of the system. In the systems we consider, the search is guided by a branching function, which determines which variable to split upon or which pair of clauses to resolve. A negative proof complexity result for a system tells us that a family of formulas is intractable even with a perfect branching function; likewise, a positive result gives us hope of finding a branching function under which the system performs well.

A basic result in proof complexity is that general resolution, denoted RES, is exponentially stronger than the DPLL procedure (Ben-Sasson, 2000; Bonet et al., 2000). This is because the trace of DPLL running on an unsatisfiable formula can be converted to a tree-like resolution proof of the same size, and tree-like proofs must sometimes be exponentially larger than the DAG-like proofs generated by RES. Although RES can yield shorter proofs, in practice DPLL is better because it provides a more efficient way to search for proofs. The weakness of the tree-like proofs that DPLL generates is that they do not reuse derived clauses. The conflict clauses found when DPLL is augmented by clause learning correspond to the reuse of derived clauses in the associated resolution proofs, and thus to more general forms of resolution proofs. As a theoretical upper bound, all DPLL-based approaches, including those involving clause learning, are captured by RES. An intuition behind the results in this paper is that the addition of clause learning moves DPLL closer to RES while retaining its practical efficiency.
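For reference, RES has a single inference rule: from two clauses containing a complementary pair of literals, derive their resolvent,
\[
\frac{(A \lor x) \qquad (B \lor \neg x)}{(A \lor B)},
\]
where $A$ and $B$ are (possibly empty) disjunctions of literals. A refutation is a derivation of the empty clause; it is tree-like if every derived clause is used as a premise at most once, so that reusing a clause requires re-deriving it.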

It has previously been observed that clause learning can be viewed as adding resolvents to a tree-like proof (Marques-Silva, 1998). However, this paper provides the first mathematical proof that clause learning, viewed as a propositional proof system CL, is exponentially stronger than tree-like resolution. This formally explains the performance gains observed empirically when clause learning is added to DPLL-based solvers. Further, we describe a generic way of extending families of formulas to obtain ones that exponentially separate CL from many refinements of resolution known to be intermediate in strength between RES and tree-like resolution. These include regular and Davis-Putnam resolution, and any other proper refinement of RES that behaves naturally under restrictions of variables, i.e., for any formula $F$ and restriction $\rho$ on its variables, the shortest proof of $F|_{\rho}$ in the system is no larger than the shortest proof of $F$ itself. (Here $F|_{\rho}$ denotes the formula obtained by setting the variables of $F$ as specified by $\rho$ and simplifying.) The argument used to prove this result involves a new clause learning scheme called FirstNewCut that we introduce specifically for this purpose. Our second technical result shows that combining a slight variant of CL, denoted CL-, with unlimited restarts yields a proof system as strong as RES itself. This intuitively explains the speed-ups obtained empirically when randomized restarts are added to DPLL-based solvers, with or without clause learning.

Given these results about the strengths and limitations of clause learning, it is natural to ask how the understanding gained through this kind of analysis can lead to practical improvements in SAT solvers. The theoretical bounds tell us the potential power of clause learning; they do not give us a way of finding short solutions when they exist. To leverage this potential, clause learning algorithms must make the ``right'' branching decisions in the underlying DPLL procedure. While a good variable order may result in a polynomial-time solution, a bad one can make the process as slow as basic DPLL without learning. The latter half of this paper addresses this problem of moving from analytical results to practical improvement. The approach we take is to use problem structure to guide SAT solvers in their branching decisions.

Both random CNF formulas and those encoding various real-world problems are quite hard for current SAT solvers. However, while DPLL-based algorithms with lookahead but no learning (such as satz by Li and Anbulagan, 1997) and those that try only one carefully chosen assignment without any backtracks (such as SurveyProp by Mézard and Zecchina, 2002) are our best tools for solving random instances, formulas arising from real applications seem to require clause learning as a critical ingredient. What makes this second class of formulas different is their inherent structure, such as dependence graphs in scheduling problems, causes and effects in planning, and algebraic structure in group theory.

Most theoretical and practical instances of satisfiability problems originate, not surprisingly, from a higher-level description: a planning domain definition language (PDDL) specification for planning, timed automata or a logic description for model checking, a task dependency graph for scheduling, a circuit description for VLSI, an algebraic structure for group theory, or a processor specification for hardware. Typically, this description contains more structure of the original problem than is visible in the flat CNF representation in DIMACS format (Johnson and Trick, 1996) to which it is converted before being fed into a SAT solver. This structure can potentially be exploited to gain efficiency in the solution process. While there has been work on extracting structure after conversion into a CNF formula by exploiting variable dependency (Giunchiglia et al., 2002; Ostrowski et al., 2002), constraint redundancy (Ostrowski et al., 2002), symmetry (Aloul et al., 2002), binary clauses (Brafman, 2001), and partitioning (Amir and McIlraith, 2000), using the original higher-level description itself to generate structural information is likely to be more effective. The latter approach, despite its intuitive appeal, remains largely unexplored, except for its suggested use in bounded model checking (Shtrichman, 2000) and the separate consideration of cause variables and effect variables in planning (Kautz and Selman, 1996).

In this paper, we advance this line of research by proposing an effective method for exploiting problem structure to guide the branching decisions of clause learning algorithms. Our approach uses the original high-level problem description to generate not only a CNF encoding but also a branching sequence that guides the SAT solver toward an efficient solution. This branching sequence serves as auxiliary structural information that may otherwise be lost in the process of encoding the problem as a CNF formula. It steers clause learning algorithms toward learning useful clauses instead of wasting time on clauses that may never be reused. We give an exact sequence generation algorithm for pebbling formulas, using the underlying pebbling graph as the high-level description. We also give a much simpler but approximate branching sequence generation algorithm for GTn formulas, utilizing their underlying ordering structure. Our sequence generators work for the 1UIP learning scheme (Zhang et al., 2001), one of the best-known schemes, and can be extended to others, including FirstNewCut. Our empirical results are based on our extension of the popular SAT solver zChaff (Moskewicz et al., 2001).
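To give a flavor of what a branching sequence generator looks like, here is a hypothetical sketch (the exact algorithms appear later in the paper; the bottom-up ordering policy and all names here are our own illustrative assumptions). It orders the branching variables by a topological traversal of the pebbling graph, so that the solver branches on source nodes before the nodes that depend on them:

# Hypothetical sketch: order branching variables by a topological
# traversal of the pebbling graph, sources first. The exact generator
# used in this paper is more involved.
from graphlib import TopologicalSorter

def branching_sequence(predecessors, node_vars):
    """predecessors: dict mapping each node to an iterable of its predecessors.
    node_vars: dict mapping each node to the CNF variables labeling it.
    Returns a list of variables in the order the solver should branch on them."""
    sequence = []
    # static_order() yields nodes with all predecessors listed earlier,
    # so the source nodes of the DAG come first.
    for node in TopologicalSorter(predecessors).static_order():
        sequence.extend(node_vars[node])
    return sequence

# Tiny example: node "c" depends on sources "a" and "b".
preds = {"a": [], "b": [], "c": ["a", "b"]}
labels = {"a": [1, 2], "b": [3, 4], "c": [5, 6]}
print(branching_sequence(preds, labels))  # e.g. [1, 2, 3, 4, 5, 6]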

We show that branching sequences produced by our generator lead to exponential empirical speedups on the class of grid and randomized pebbling formulas. These formulas, which occur commonly in the proof complexity literature (Ben-Sasson, 2000; Beame et al., 2003a), can be thought of as representing precedence graphs in dependent task systems and scheduling scenarios; they can also be viewed as restricted planning problems (a simplified variant is sketched after this paragraph). Although they admit polynomial-size solutions, grid and randomized pebbling problems are not easy to solve deterministically, as indicated by our experimental results for unmodified zChaff. We also report significant gains on the class of GTn formulas, which again appear frequently in proof complexity results (Krishnamurthy, 1985; Alekhnovich et al., 2002; Bonet and Galesi, 2001). From a broader perspective, our results for pebbling and GTn formulas serve as a proof of concept that analysis of problem structure can be used to achieve dramatic improvements even in the current best clause-learning-based SAT solvers.
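For intuition about the pebbling formulas just mentioned, consider a simplified variant (our illustration) with one variable $x_v$ per node $v$ of a DAG $G$: the formula asserts that every source is pebbled, that pebbles propagate along edges, and that the target is nevertheless unpebbled,
\[
x_s \ \text{ for each source } s, \qquad (\neg x_u \lor \neg x_v \lor x_w) \ \text{ for each node } w \text{ with predecessors } u, v, \qquad \neg x_t \ \text{ for the target } t,
\]
which is contradictory. The variants used in our experiments label each node with a disjunction of variables rather than a single variable; as noted above, these admit polynomial-size solutions yet are hard for deterministic solvers that lack guidance.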

