haskell sequence infinite self-reference recursive-datastructures

Infinite self-referencing list

Problem

I'm trying to implement the modified Dragon Curve from AoC Day 16 as an infinite list in Haskell.

The list is composed of True and False. We start with some list s0:

s1 = s0 ++ [False] ++ (map not . reverse) s0
s2 = s1 ++ [False] ++ (map not . reverse) s1
s3 = s2 ++ [False] ++ (map not . reverse) s2

Generally

sn = s(n-1) ++ [0] ++ (map not . reverse) s(n-1) 
   = s0 ++ [0] ++ (f s0) ++ [0] ++ (f (s0 ++ [0] ++ (f s0))) ++ ...
      where f = (map not . reverse)

Attempted Implementation

I can get sn quite easily using the iterate function.

modifiedDragonCurve :: [Bool] -> Int -> [Bool]
modifiedDragonCurve s n = (iterate f s)!!n
    where f s     = s ++ [False] ++ (map not . reverse) s

This gives me a list [s0, s1, s2, ...]. However, since s(n-1) is a prefix of sn this could be built as an infinite list, but i cannot figure out how to approach it. I think I need something along the lines of

modifiedDragonCurve :: [Bool] -> [Bool]
modifiedDragonCurve s = s ++ [False] ++ (map not . reverse) listSoFar

But cannot figure out how to refer to the already generated list (listSoFar).

Any suggestions would be greatly appreciated.

Solution

I played with this myself while solving the AoC problem, too. I found a remarkable solution that does not require reverse, hence is more memory-friendly and speedy than the other solutions listed here. It's also beautiful! The dragon curve itself is a nice short two-liner:

merge (x:xs) ys = x:merge ys xs
dragon = merge (cycle [False, True]) dragon

It can be extended to use a "seed" as the AoC problem demands just by alternating between the seed and the bits of the true dragon curve:

infinite bs = go bs (map not (reverse bs)) dragon where
    go bs bs' (d:ds) = bs ++ [d] ++ go bs' bs ds

(This does call reverse once -- but unlike other solutions, it is called just once on a chunk of data about the size of the input, and not repeatedly on chunks of data about as large as the part of the list you consume.) Some timings to justify my claims; all versions used to produce 2^25 elements with an empty seed, compiled with ghc -O2, and timed with /usr/bin/time.

freestyle's solution takes 11.64s, ~1.8Gb max resident
David Fletcher's solution takes 10.71s, ~2Gb max resident
luqui's solution takes 9.93s, ~1GB max resident
my solution takes 8.87s, ~760MB max resident

The full test program was

main = mapM_ print . take (2^25) . dragon $ []

with dragon replaced by each implementation in turn. A carefully crafted consumer can lower memory usage even further: my best solution to the second-star problem so far runs in 5Mb real residency (i.e. including all the space GHC allocated from the OS for its multiple generations, slack space, and other RTS overhead), 60Kb GHC-reported residency (i.e. just the space used by not-yet-GC'd objects, regardless of how much space GHC has allocated from the OS).

For raw speed, though, you can't beat an unboxed mutable vector of Bool: a coworker reports that his program using such ran in 0.2s, using about 35Mb memory to store the complete expanded (but not infinite!) vector.