Search code examples
recursionclojureaccumulatorcons

Accumulators, conj and recursion


I've solved 45 problems from 4clojure.com and I noticed a recurring problem in the way I try to solve some problems using recursion and accumulators.

I'll try to explain the best I can what I'm doing to end up with fugly solutions hoping that some Clojurers would "get" what I'm not getting.

For example, problem 34 asks to write a function (without using range) taking two integers as arguments and creates a range (without using range). Simply put you do (... 1 7) and you get (1 2 3 4 5 6).

Now this question is not about solving this particular problem.

What if I want to solve this using recursion and an accumulator?

My thought process goes like this:

  • I need to write a function taking two arguments, I start with (fn [x y] )

  • I'll need to recurse and I'll need to keep track of a list, I'll use an accumulator, so I write a 2nd function inside the first one taking an additional argument:

    (fn [x y]
    ((fn g [x y acc] ...) x y '())

(apparently I can't properly format that Clojure code on SO!?)

Here I'm already not sure I'm doing it correctly: the first function must take exactly two integer arguments (not my call) and I'm not sure: if I want to use an accumulator, can I use an accumulator without creating a nested function?

Then I want to conj, but I cannot do:

(conj 0 1)

so I do weird things to make sure I've got a sequence first and I end up with this:

(fn
   [x y]
   ((fn g [x y acc] (if (= x y) y (conj (conj acc (g (inc x) y acc)) x)))
    x
    y
    '()))

But then this produce this:

(1 (2 (3 4)))

Instead of this:

(1 2 3 4)

So I end up doing an additional flatten and it works but it is totally ugly.

I'm beginning to understand a few things and I'm even starting, in some cases, to "think" in a more clojuresque way but I've got a problem writing the solution.

For example here I decided:

  • to use an accumulator
  • to recurse by incrementing x until it reaches y

But I end up with the monstrosity above.

There are a lot of way to solve this problem and, once again, it's not what I'm after.

What I'm after is how, after I decided to cons/conj, use an accumulator, and recurse, I can end up with this (not written by me):

#(loop [i %1
        acc nil]
  (if (<= %2 i)
    (reverse acc)
    (recur (inc i) (cons i acc))))

Instead of this:

((fn
  f
  [x y]
  (flatten
   ((fn
     g
     [x y acc]
     (if (= x y) acc (conj (conj acc (g (inc x) y acc)) x)))
    x
    y
    '())))
 1
 4)

I take it's a start to be able to solve a few problems but I'm a bit disappointed by the ugly solutions I tend to produce...


Solution

  • i think there are a couple of things to learn here.

    first, a kind of general rule - recursive functions typically have a natural order, and adding an accumulator reverses that. you can see that because when a "normal" (without accumulator) recursive function runs, it does some work to calculate a value, then recurses to generate the tail of the list, finally ending with an empty list. in contrast, with an accumulator, you start with the empty list and add things to the front - it's growing in the other direction.

    so typically, when you add an accumulator, you get a reversed order.

    now often this doesn't matter. for example, if you're generating not a sequence but a value that is the repeated application of a commutative operator (like addition or multiplication). then you get the same answer either way.

    but in your case, it is going to matter. you're going to get the list backwards:

    (defn my-range-0 [lo hi] ; normal recursive solution
      (if (= lo hi)
        nil
        (cons lo (my-range-0 (inc lo) hi))))
    
    (deftest test-my-range-1
      (is (= '(0 1 2) (my-range-0 0 3))))
    
    (defn my-range-1 ; with an accumulator
      ([lo hi] (my-range-1 lo hi nil))
      ([lo hi acc]
        (if (= lo hi)
          acc
          (recur (inc lo) hi (cons lo acc)))))
    
    (deftest test-my-range-1
      (is (= '(2 1 0) (my-range-1 0 3)))) ; oops!  backwards!
    

    and often the best you can do to fix this is just reverse that list at the end.

    but here there's an alternative - we can actually do the work backwards. instead of incrementing the low limit you can decrement the high limit:

    (defn my-range-2
      ([lo hi] (my-range-2 lo hi nil))
      ([lo hi acc]
        (if (= lo hi)
          acc
          (let [hi (dec hi)]
            (recur lo hi (cons hi acc))))))
    
    (deftest test-my-range-2
      (is (= '(0 1 2) (my-range-2 0 3)))) ; back to the original order
    

    [note - there's another way of reversing things below; i didn't structure my argument very well]

    second, as you can see in my-range-1 and my-range-2, a nice way of writing a function with an accumulator is as a function with two different sets of arguments. that gives you a very clean (imho) implementation without the need for nested functions.


    also you have some more general questions about sequences, conj and the like. here clojure is kind-of messy, but also useful. above i've been giving a very traditional view with cons based lists. but clojure encourages you to use other sequences. and unlike cons lists, vectors grow to the right, not the left. so another way to reverse that result is to use a vector:

    (defn my-range-3 ; this looks like my-range-1
      ([lo hi] (my-range-3 lo hi []))
      ([lo hi acc]
        (if (= lo hi)
          acc
          (recur (inc lo) hi (conj acc lo)))))
    
    (deftest test-my-range-3 ; except that it works right!
      (is (= [0 1 2] (my-range-3 0 3))))
    

    here conj is adding to the right. i didn't use conj in my-range-1, so here it is re-written to be clearer:

    (defn my-range-4 ; my-range-1 written using conj instead of cons
      ([lo hi] (my-range-4 lo hi nil))
      ([lo hi acc]
        (if (= lo hi)
          acc
          (recur (inc lo) hi (conj acc lo)))))
    
    (deftest test-my-range-4
      (is (= '(2 1 0) (my-range-4 0 3))))
    

    note that this code looks very similar to my-range-3 but the result is backwards because we're starting with an empty list, not an empty vector. in both cases, conj adds the new element in the "natural" position. for a vector that's to the right, but for a list it's to the left.

    and it just occurred to me that you may not really understand what a list is. basically a cons creates a box containing two things (its arguments). the first is the contents and the second is the rest of the list. so the list (1 2 3) is basically (cons 1 (cons 2 (cons 3 nil))). in contrast, the vector [1 2 3] works more like an array (although i think it's implemented using a tree).

    so conj is a bit confusing because the way it works depends on the first argument. for a list, it calls cons and so adds things to the left. but for a vector it extends the array(-like thing) to the right. also, note that conj takes an existing sequence as first arg, and thing to add as second, while cons is the reverse (thing to add comes first).


    all the above code available at https://github.com/andrewcooke/clojure-lab


    update: i rewrote the tests so that the expected result is a quoted list in the cases where the code generates a list. = will compare lists and vectors and return true if the content is the same, but making it explicit shows more clearly what you're actually getting in each case. note that '(0 1 2) with a ' in front is just like (list 0 1 2) - the ' stops the list from being evaluated (without it, 0 would be treated as a command).