Search code examples
haskellrecursionlist-comprehension

Understanding infinite list of lists with list comprehension and recursion


I'm learning Haskell coming from an imperative (mostly C) programming pov, with some basic understanding of Prolog. While trying to solve some exercises I've come across one that has bugged me for some time. After finding the answer provided, more doubts arose. I have to generate an infinite list of lists starting from an alphabet given as input (also known as kleene closure). The function is defined as follows:

kleene :: [a] -> [[a]]
kleene xs = []:[kle++[x] | kle <- kleene xs, x <- xs]

It works, but I do not seem to grasp how it works (which is the main point of the exercise I think). From what I understand from list comprehensions, I take the first element of the first list (here kleene xs) and then go through the elements of the second list (xs). This example shows the first 20 elements of the kleene closure.

> take 20 $ kleene [1,2,3]
> [[],[1],[2],[3],[1,1],[1,2],[1,3],[2,1],[2,2],[2,3],[3,1],[3,2],[3,3],[1,1,1],[1,1,2],[1,1,3],[1,2,1],[1,2,2],[1,2,3],[1,3,1]]

Now, what's going on here? Why the evaluation of this does not spiral out of control? Is it because of lazy evaluation, that allows to go on without really computing if not necessary? If I try to substitute and evaluate the expression, I get stuck in a loop:

[[],[kleene [1,2,3],1],[kleene [1,2,3],2],[kleene [1,2,3],3],[kleene [1,2,3],...]

To compute kleene I have to infinitely compute it. In the first row, once I resolve kleene I get

[[],[[],kleene [1,2,3],1,1],[[], kleene [1,2,3],1,2],[[], kleene [1,2,3],1,3],...]

But then it should go on for ever. Why do I get an output that is progressively bigger instead of infinitely big elements? Is it because of Lazy Evaluation, that each recursive call counts as an element of the list, so that the next element is of kle is only the subsequent recursive call, while carrying the elements of xs every time? Can someone help me out understanding this?


Solution

  • Is it because of lazy evaluation, that allows to go on without really computing if not necessary?

    More or less, yes.

    If you call kleene xs it returns [] : […], so it is a "cons" (a list data object) where the head is known, an empty list, and something that still needs to be evaluated).

    This thus means that it can produce the first item, without any problem. Which thus can be used in the list comprehension.

    The first item for kleene xs is thus [], for any value of xs, so in the list comprehension []:[kle++[x] | kle <- kleene xs, x <- xs] it thus takes kle = [] and then starts enumerating over the lists, so if xs = [1,2], then it will produce kle ++ [1] and kle ++ [2], so [1] and [2]. Then it moves to the next item, this will force producing the next item in the recursive call, but we can, these are [1] and [2] and these can be produced without any recursive call to kleene xs. After that it thus will make an extra recursive call.

    This makes it not very efficient. It will grow with 𝓞(log2n) levels of recursive calls for a list with length n. We can optimize this by simply referring to the same list we are constructing. Indeed:

    kleene :: [a] -> [[a]]
    kleene xs = go
      where go = [] : [kle++[x] | kle <- go, x <- xs]
    

    Now we thus only make one call and the producer of the next items of the list, will also read items from the same list. The downside of this is that it will require memory. Indeed, it will hold all the items between the current position we need for kle <- go, and the one we are producing.