Search code examples
z3z3py

How to efficiently assert that two large sets don't intersect?


I have two large collections of Z3 integer variables. Call them collection A and collection B. (Group membership is known in advance, so there's no need to use Z3 sets.) I need to generate assertions that ensure no element in A is equal to an element in B. The obvious way is the following:

for all a in A:
  for all b in B:
    solver.add(a != b);

However, the collections are large and this would add over 20 million assertions, so it's not an option.

Here's another approach I came up with that only involves assert a total of O(n+m) clauses:

a = ctx.int_const("a");
a_def = (a == A[0] || a == A[1] || ... || A[n]);

b = ctx.int_const("b");
b_def = (b == B[0] || b == B[1] || ... || B[m]);

solver.add(z3::forall(a, b, z3::implies(a_def && b_def, a != b)));

Is there a more efficient way to do this? It seems like the above approach presents the relationship between A and B in an indirect way to the solver, which I worry will hurt performance.


Solution

  • I think your best bet is to directly use Distinct (https://z3prover.github.io/api/html/namespacez3py.html#a9eae89dd394c71948e36b5b01a7f3cd0):

    s.add(Distinct(*(A+B)))
    

    And z3 will internally handle this (hopefully!) with a pseudo-boolean encoding and be quite efficient. Note that this is just one clause, though internally z3 will transform it to an efficient form.

    Another option, since you're already starting with the assumption that A and B are sets, is to use min(m,n) calls to Distinct:

    for a in A:
        s.add(Distinct(a, *B))
    

    obviously, do this loop over the shorter list; thus creating min(m,n) assertions.

    Your O(m+n) solution can work pretty well too. I don't see any obvious reasons why it shouldn't. (Although, the presence of quantifiers make life hard for SMT solvers. So, your mileage might vary, depending on the other constraints in the system.)

    With anything performance related, it's best to do some experiments and see what works the best. I think it'll really depend on what other constraints you have on these elements and how well these new assertions will play with them. Without knowing your entire constraint set, it's anyone's guess.