Search code examples
arrayshaskelldata-structureslazy-evaluationrepa

Appropriate representation for lazy array in Haskell


I'm trying to make a program that has massive updates on one large array, but evaluates only few times. I want computations to be lazy as possible, but I can't find out which Array representation is good for my case. To be specific, I want my array to:

  • have fixed size
  • access in constant time
  • update n elements in O(n)
  • lazy evaluated

How to approach these requirements? As a sub-question: are there libraries specifically geared for this use case?


EDIT:

Maybe my question wasn't specific enough, so I'll try to explain about my case more.

I'm trying to represent various size of image which can be either relatively small(about 1200x800), or much larger than that(at least 8000x8000). Also, there will be lots of layers for one image, which means that if I want to draw image on screen, there will be lots of updates on frame buffer image. I thought if I can exploit lazy-evaluated nature of haskell, I would be able to write on frame buffer only once, rather than overwriting same pixel at each updates.

I'm aware of several options for representing array in haskell, but all of those doesn't seem to fit in my case. For example:

  • Data.Seq, Data.IntTrie : can't be accessed in constant time
  • Data.Vector, Data.Array : update n elements takes more than O(n)
  • Unboxed variants : not lazy-evaluated(I guess?)

What approach should I take in this situation?


Solution

  • One library to consider is repa. One of the key ideas in repa is that the Array type is parametrized over its underlying representation. A sample of these is

    • the delayed representation D for arrays that "aren't real" - they just compute elements on demand;
    • the unboxed vector representations U for unboxed data
    • the boxed vector representation V for boxed data (use only if you can't make an Unbox instance of your data)
    • a reference to a foreign buffer F, especially useful if you want to be writing into some foreign bitmap graphics buffer

    When you compile your code, delayed arrays will get optimized away. It goes without saying that the unboxed arrays have O(1) access. For your case, you will probably just be using arrays of shape DIM2 (2D arrays).

    For demonstration purposes, here is a program that does something similar to what you want: It takes in a list of bitmap layers and their offsets, as well as background bitmap. Then, it draws these layers on top of the bitmap background. It depends on repa-io for loading the images into repa arrays.

    import Data.Array.Repa.IO.BMP
    import Data.Array.Repa.Shape
    import Data.Array.Repa
    import Data.Word
    
    -- This is our internal representation of a bitmap
    type Image s = Array s DIM2 (Word8, Word8, Word8)
    data Layer s = Layer { image :: Image s, offset :: (Int, Int) }
    
    drawLayersOnBg :: Image U -> [Layer U] -> Image D
    drawLayersOnBg background layers = foldl (\bg (Layer im (x,y)) -> overlay bg (ix2 y x) im) (delay background) layers
      where
        overlay :: Image D -> DIM2 -> Image U -> Image D
        overlay bg off ov = backpermuteDft bg
                                           (\i -> let i' = i `subDim` off
                                                     in if extent ov `inShape` I'
                                                          then Just i'
                                                          else Nothing)
                                           ov
    
        subDim :: DIM2 -> DIM2 -> DIM2
        subDim (Z :. y :. x) (Z :. dy :. dx) = ix2 (y - dy) (x - dx)
    
    main = do
      Right bg <- readImageFromBMP "background.bmp"
      Right l1 <- readImageFromBMP "layer1.bmp"
      Right l2 <- readImageFromBMP "layer2.bmp"
      Right l3 <- readImageFromBMP "layer3.bmp"
    
      out <- computeP $ drawLayersOnBg bg [ Layer l1 (200,100) , Layer l2 (100, 200), Layer l3 (-300,400) ]
      writeImageToBMP "out.bmp" out
    

    With some random bitmaps from the internet I got this as an output:

    output

    If I wanted to, I could even have used fromForeignPtr to read bitmaps straight from some foreign memory buffer and computeIntoP instead of computeP to write the bitmap output straight into some foreign memory buffer (no intermediate allocations). If you need high-performance, those are especially interesting.