difference in encoding of the same axiom

I'm wondering what is the difference between these two encoding of the same list axiom:

(define-sort T1 () Int)

(declare-fun list_length ( (List T1) ) Int)

(assert (forall ( (i T1) (l (List T1)) )
                (ite (= l (as nil (List T1)))
                     (= (list_length l) 0)
                     (= (list_length (insert i l)) (+ 1 (list_length l))))))

and

(define-sort T1 () Int)

(declare-fun list_length ( (List T1) ) Int)

(assert (= (list_length (as nil (List T1))) 0))

(assert (forall ( (i T1) (l (List T1)) )
                (= (list_length (insert i l)) (+ 1 (list_length l)))))

For this benchmark:

(declare-const a T1)
(declare-const b T1)

(assert (not
         (= (list_length (insert b (insert a (as nil (List T1))))) 2)))

(check-sat)

Somehow z3 is able to reason about the second version but not the first (where it seems to just loop forever).

Edit: same with cvc4 with the first version returning unknown.

Solution

First-order logic with quantifiers is essentially semi-decidable. In the SMT context, this means that there is no decision procedure to answer every query correctly as sat/unsat.

(Theoretical aside, not that it's that important: If you completely ignore efficiency considerations, then there are algorithms that can answer all satisfiable queries correctly, but there are no algorithms that can correctly deduce unsat. In this latter case, they'd loop forever. But this is a digression.)

So, to deal with quantifiers, SMT solvers usually employ a technique known as E-matching. Essentially, when they form a ground term mentioning uninterpreted functions, they try to instantiate quantified axioms to match them and rewrite accordingly. This technique can be quite effective in practice and scales well with typical software verification problems, but it obviously is not a panacea. For details, see this paper: https://pdfs.semanticscholar.org/4eb2/c5e05ab5c53f20c6050f8252a30cc23561be.pdf.

Regarding your question: Essentially, when you have the ite form of the axiom, the e-matching algorithm simply fails to find the proper substitution to instantiate your axiom. For efficiency considerations, the e-matcher really looks at almost "exact" matches. (Take this with a grain of salt; it's smarter than that, but not by much.) Being too smart here hardly ever pays off in practice, since you can end up generating way too many matchings and end up exploding your search space. As usual, it's a balance between practicality, efficiency, and covering as many cases as possible.

Z3 allows specifying patterns to guide that search to a certain extent, but patterns are rather tricky to use and fragile. (I'd have pointed you to the right place in the documentation for patterns, alas the z3 documentation site is down for the time being as you yourself noticed!) You might want to play around with them to see if they give you better results. But the rule of thumb is to keep your quantified axioms as simple and obvious as possible. And your second variant precisely does that, as compared to the first. For this particular problem, definitely split the axiom into two parts, and assert both separately to cover the nil/insert cases. Combining them into one rule simply exceeds the capabilities of the current e-matcher.