Search code examples
algorithmschemeknuth-morris-pratt

Knuth-Morris-Pratt algorithm in Scheme


This is the code to calculate the failure function (how many steps we have to go back) in Scheme, when we use the Knuth-Morris-Pratt algorithm:

(define (compute-failure-function p)
    (define n-p (string-length p))
    (define sigma-table (make-vector n-p 0))
    (let loop
        ((i-p 2)
         (k 0))
      (cond
          ((>= i-p n-p)
           (vector-set! sigma-table (- n-p 1) k))
          ((eq? (string-ref p k)
                (string-ref p (- i-p 1)))
           (vector-set! sigma-table i-p (+ k 1))
           (loop (+ i-p 1) (+ k 1)))
          ((> k 0)
           (loop i-p (vector-ref sigma-table k)))
          (else ; k=0
           (vector-set! sigma-table i-p 0)
           (loop (+ i-p 1) k))))
    (vector-set! sigma-table 0 -1)
    (lambda (q)
        (vector-ref sigma-table q)))

But I do not understand the part when k > 0. Can someone explain it please?


Solution

  • I see you're confused with the syntax of a named let. This post does a good job explaining how it works, but perhaps an example with more familiar syntax will make things clearer. Take this code in Python, it adds all integers from 1 to 10:

    sum = 0
    n = 1
    while n <= 10:
      sum += n
      n += 1
    
    print(sum)
    => 55
    

    Now let's try to write it in a recursive fashion, I'll call my function loop. This is completely equivalent:

    def loop(n, sum):
        if n > 10:
            return sum
        else:
            return loop(n + 1, n + sum)
    
    loop(1, 0)
    => 55
    

    In the above example, the loop function implements an iteration, the parameter n is used to keep track of the current position, and the parameter sum accumulates the answer. Now let's write the exact same code, but in Scheme:

    (let loop ((n 1) (sum 0))
      (cond ((> n 10) sum)
            (else (loop (+ n 1) (+ n sum)))))
    => 55
    

    Now we've defined a local procedure called loop which is then automatically called with the initial values 1 and 0 for its parameters n and sum. When the base case of the recursion is reached, we return sum, otherwise we keep calling this procedure, passing updated values for the parameters. It's exactly the same as in the Python code! Don't be confused by the syntax.

    In your algorithm, i-p and k are the iteration variables, which are initialized to 2 and 0 respectively. Depending on which condition is true, the iteration continues when we call loop again with updated values for i-p and k, or it ends when the case (>= i-p n-p) is reached, at this point the loop exits and the computed value is in the variable sigma-table. The procedure ends by returning a new function, referred to as the "failure function".