Search code examples
pythonz3or-toolsz3pysat

z3 much slower than ortools SAT. Why?


I am trying out different solvers for a toy clique problem and was surprised to find that ortools using SAT seems much faster than z3. I am wondering if I am doing something wrong given that z3 does so well in the published benchmarks. Here is an MWE:

First create a random graph with 150 vertices:

# First make a random graph and find the largest clique size in it
import igraph as ig
import random
from tqdm import tqdm
random.seed(7)
num_variables = 150  # bigger gives larger running time gap between z3 and ortools grow
print("Making graph")
g = ig.Graph.Erdos_Renyi(num_variables, 0.6)

# Make a set of edges. Maybe this isn't necessary
print("Making set of edges")
edges = set()
for edge in tqdm(g.es):
    edges.add((edge.source, edge.target))

Now use z3 to find the max clique size:

import z3
z3.set_option(verbose=1)
myVars = []
for i in range(num_variables):
    myVars += [z3.Int('v%d' % i)]

opt = z3.Optimize()

for i in range(num_variables):
    opt.add(z3.Or(myVars[i]==0, myVars[i] == 1))

for i in tqdm(range(num_variables)):
    for j in range(i+1, num_variables):
        if not (i, j) in edges:
            opt.add(myVars[i] + myVars[j] <= 1)

t = time()
h = opt.maximize(sum(myVars))
opt.check()
print(round(time()-t,2))

This takes around 70 seconds on my PC.

Now do the same thing using SAT from ortools.

from ortools.linear_solver import pywraplp
solver = pywraplp.Solver.CreateSolver('SAT')
solver.EnableOutput()

myVars = []

for i in range(num_variables):
        myVars += [solver.IntVar(0.0, 1.0, 'v%d' % i)]

for i in tqdm(range(num_variables)):
    for j in range(i+1, num_variables):
        if not (i, j) in edges:
            solver.Add(myVars[i] + myVars[j] <= 1)

print("Solving")
solver.Maximize(sum(myVars))
t = time()
status = solver.Solve()
print(round(time()-t,2))

This takes about 4 seconds on my PC.

If you increase num_variables the gap grows even larger.

Am I doing something wrong or is this just a really bad case for the z3 optimizer?


Update

It turns out that ortools using SAT is multi-core by default using 8 cores so the timings are unfair. Using @AxelKemper's improvement I now get (on a different machine):

  • Z3 time 88 seconds. If we assume perfect parallelisation that is 11 seconds on 8 cores.
  • ortools 4.3 seconds

So Z3 is only 2.5 times slower than ortools.

Update 2

Using @alias's improved code that uses s.add(z3.PbGe([(x, 1) for x in myVars], k)) it now takes 30.4 seconds which when divided by 8 is faster than ortools!


Solution

  • A custom tool like ortools is usually hard to beat, as it understands only a fixed number of domains, and they take advantage of parallel hardware. An SMT solver shines not at speed, but rather at what it allows: Combination of many many theories (arithmetic, data-structures, booleans, reals, integers, floats, etc.), so it's not a fair comparison.

    Having said that, we can make z3 go faster by using three ideas:

    • Use Bool instead of Int as Axel suggested. z3 is much better dealing with booleans as is, instead of coding them via integers. See this earlier answer for general advice on coding in z3.

    • z3's regular solver is much more adept at handling many constraints compared to the optimizer. While the optimizer definitely has its applications, avoid it if you can. If your problem can use an iterative approach (i.e., get a solution and keep improving by adding new constraints), it's definitely worth trying. Not all optimization problems are amenable to this sort of iteration based optimization of course, but whenever applicable, it can pay off. (One reason for this is that the optimization is a much more difficult problem that can get the solver bogged down in irrelevant details. Second reason is that z3's optimizer hasn't received much attention in recent years compared to the solver engines, so it's not keeping up with the improvements that were done to the mainline solver. At least this is my impression.)

    • Use pseudo-boolean equalities, for which z3 has internal tactics to deal with much more efficiently. For pseudo-boolean constraints, see this answer.

    Putting all these ideas together, here's how I'd code your problem in z3:

    import igraph as ig
    import random
    from time import *
    import z3
    
    # First make a random graph and find the largest clique size in it
    from tqdm import tqdm
    random.seed(7)
    num_variables = 150  # bigger gives larger running time gap between z3 and ortools grow
    print("Making graph")
    g = ig.Graph.Erdos_Renyi(num_variables, 0.6)
    
    # Make a set of edges. Maybe this isn't necessary
    print("Making set of edges")
    edges = set()
    for edge in tqdm(g.es):
        edges.add((edge.source, edge.target))
    
    myVars = []
    for i in range(num_variables):
        myVars += [z3.Bool('v%d' % i)]
    
    total = sum([z3.If(i, 1, 0) for i in myVars])
    
    s = z3.Solver()
    for i in tqdm(range(num_variables)):
        for j in range(i+1, num_variables):
            if not (i, j) in edges:
                s.add(z3.Not(z3.And(myVars[i], myVars[j])))
    
    t = time()
    curTotal = 0
    while True:
      s.add(z3.PbGe([(x, 1) for x in myVars], curTotal+1))
      r = s.check()
      if r == z3.unsat:
          break
      curTotal = s.model().evaluate(total, model_completion=True).as_long()
      print(r, curTotal, round(time()-t,2))
    
    print("DONE total time =", round(time()-t,2))
    

    Note the use of z3.PbGe which introduces the pseudo-boolean constraints. In my experiments this runs faster than the regular constraint version; and if you also assume 8-core full parallelization speed-up, I think it compares favorably to the ortools solution.

    Note that the iterative loop above is coded in a linear fashion, i.e., we ask for curTotal+1 solution, incrementing the last solution found by z3 only by one in each call to check. At the expense of some extra programming, you can also implement a binary-search like routine, i.e., ask for double the last result, and if unsat scale back to find the optimum, using push/pop to manage the assertion stack. This trick can perform even better when amortized over many runs.

    Please report the results of your own experiments!