Search code examples
computer-scienceschemesicpunification

Seemingly unnecessary case in the unification algorithm in SICP


I'm trying to understand the unification algorithm described in SICP here

In particular, in the procedure "extend-if-possible", there's a check (the first place marked with asterix "*") which is checking to see if the right hand "expression" is a variable that is already bound to something in the current frame:

(define (extend-if-possible var val frame)
  (let ((binding (binding-in-frame var frame)))
    (cond (binding
       (unify-match
        (binding-value binding) val frame))
      ((var? val)                      ; *** why do we need this?
       (let ((binding (binding-in-frame val frame)))
         (if binding
             (unify-match
              var (binding-value binding) frame)
             (extend var val frame))))
      ((depends-on? val var frame)
       'failed)
      (else (extend var val frame)))))

The associated commentary states:

"In the first case, if the variable we are trying to match is not bound, but the value we are trying to match it with is itself a (different) variable, it is necessary to check to see if the value is bound, and if so, to match its value. If both parties to the match are unbound, we may bind either to the other."

However, I cannot think of a case where this is actually necessary.

What's it's talking about, I think, is where you might have the following frame bindings currently present:

{?y = 4}

And are then asked to "extendIfPossible" a binding from ?z to ?y.

With that "*" check present, when asked to extend "?z" with "?y", we see that "?y" is already bound to 4, and then recursively try to unify "?z" with "4", which results with us extending the frame with "?z = 4".

Without the check, we would end up extending the frame with just "?z = ?y". But in both cases, so long as ?z is not already bound to something else, I don't see the problem.

Note, if ?z had already been bound to something else then the code doesn't reach the part marked "*" (we would have already recursed to unifying with what ?z had already been matched to).

After thinking it over, I have realised that there might be some sort of argument for generating a "simplest" MGU (Most General Unifier). e.g. you might want an MGU with a minimal number of variables referring to other variables... that is, we'd rather generate the substitution {?x = 4, ?y = 4} than the substitution {?x = ?y, ?y = 4}... but I don't think this algorithm would guarantee this behaviour in any case, because if you asked it to unify "(?x 4)" with "(?y ?y)" then you would still end up with {?x = ?y, ?y = 4}. And if the behaviour can't be guaranteed, why the additional complexity?

Is my reasoning here all correct? If not, what's a counter example where the "*" check is necessary to generate a correct MGU?


Solution

  • That's a good question!

    I think the reason is that you don't want to end up with circular bindings such as { ?x = ?y, ?y = ?x }. In particular, unifying (?x ?y) with (?y ?x) would give you the circular frame above if you omitted the check. With the check, you get the frame { ?x = ?y } as expected.

    Circular bindings in a frame are bad, because they may cause functions performing substitutions using the frame, such as instantiate, to run in an infinite loop.