Search code examples
z3z3py

What alternatives exist for representing sets in Z3?


Per this answer the Z3 set sort is implemented using arrays, which makes sense given the SetAdd and SetDel methods available in the API. It is also claimed here that if the array modification functions are never used, it's wasteful overhead to use arrays instead of uninterpreted functions. Given that, if my only uses of a set are to apply constraints with IsMember (either on individual values or as part of a quantification), is it a better idea to use an uninterpreted function mapping from the underlying element sort to booleans? So:

from z3 import *
s = Solver()
string_set = SetSort(StringSort())
x = String('x')
s.add(IsMember(x, string_set))

becomes

from z3 import *
s = Solver()
string_set = Function('string_set', StringSort(), BoolSort())
x = String('x')
s.add(string_set(x))

Are there any drawbacks to this approach? Alternative representations with even less overhead?


Solution

  • Those are really your only options, as long as you want to restrict yourself to the standard interface. In the past, I also had luck with representing sets (and in general relations) outside of the solver, keeping the processing completely outside. Here's what I mean:

    from z3 import *
    
    def FSet_Empty():
        return lambda x: False
    
    def FSet_Insert(val, s):
        return lambda x: If(x == val, True, s(val))
    
    def FSet_Delete(val, s):
        return lambda x: If(x == val, False, s(val))
    
    def FSet_Member(val, s):
        return s(val)
    
    x, y, z = Ints('x y z')
    
    myset = FSet_Insert(x, FSet_Insert(y, FSet_Insert(z, FSet_Empty())))
    
    s = Solver()
    s.add(FSet_Member(2, myset))
    
    print(s.check())
    print(s.model())
    

    Note how we model sets by unary relations, i.e., functions from values to booleans. You can generalize this to arbitrary relations and the ideas carry over. This prints:

    sat
    [x = 2, z = 4, y = 3]
    

    You can easily add union (essentially Or), intersection (essentially And), and complement (essentially Not) operations. Doing cardinality is harder, especially in the presence of complement, but that's true for all the other approaches too.

    As is usual with these sorts of modeling questions, there's no single approach that will work best across all problems. They'll all have their strengths and weaknesses. I'd recommend creating a single API, and implementing it using all three of these ideas, and benchmarking your problem domain to see what works the best; keeping in mind if you start working on a different problem the answer might be different. Please report your findings!