arrays haskell data-structures lazy-evaluation repa

Appropriate representation for lazy array in Haskell

I'm trying to make a program that has massive updates on one large array, but evaluates only few times. I want computations to be lazy as possible, but I can't find out which Array representation is good for my case. To be specific, I want my array to:

have fixed size
access in constant time
update n elements in O(n)
lazy evaluated

How to approach these requirements? As a sub-question: are there libraries specifically geared for this use case?

EDIT:

Maybe my question wasn't specific enough, so I'll try to explain about my case more.

I'm trying to represent various size of image which can be either relatively small(about 1200x800), or much larger than that(at least 8000x8000). Also, there will be lots of layers for one image, which means that if I want to draw image on screen, there will be lots of updates on frame buffer image. I thought if I can exploit lazy-evaluated nature of haskell, I would be able to write on frame buffer only once, rather than overwriting same pixel at each updates.

I'm aware of several options for representing array in haskell, but all of those doesn't seem to fit in my case. For example:

Data.Seq, Data.IntTrie : can't be accessed in constant time
Data.Vector, Data.Array : update n elements takes more than O(n)
Unboxed variants : not lazy-evaluated(I guess?)

What approach should I take in this situation?

Solution

One library to consider is repa. One of the key ideas in repa is that the Array type is parametrized over its underlying representation. A sample of these is

the delayed representation D for arrays that "aren't real" - they just compute elements on demand;
the unboxed vector representations U for unboxed data
the boxed vector representation V for boxed data (use only if you can't make an Unbox instance of your data)
a reference to a foreign buffer F, especially useful if you want to be writing into some foreign bitmap graphics buffer

When you compile your code, delayed arrays will get optimized away. It goes without saying that the unboxed arrays have O(1) access. For your case, you will probably just be using arrays of shape DIM2 (2D arrays).

For demonstration purposes, here is a program that does something similar to what you want: It takes in a list of bitmap layers and their offsets, as well as background bitmap. Then, it draws these layers on top of the bitmap background. It depends on repa-io for loading the images into repa arrays.

import Data.Array.Repa.IO.BMP
import Data.Array.Repa.Shape
import Data.Array.Repa
import Data.Word

-- This is our internal representation of a bitmap
type Image s = Array s DIM2 (Word8, Word8, Word8)
data Layer s = Layer { image :: Image s, offset :: (Int, Int) }

drawLayersOnBg :: Image U -> [Layer U] -> Image D
drawLayersOnBg background layers = foldl (\bg (Layer im (x,y)) -> overlay bg (ix2 y x) im) (delay background) layers
  where
    overlay :: Image D -> DIM2 -> Image U -> Image D
    overlay bg off ov = backpermuteDft bg
                                       (\i -> let i' = i `subDim` off
                                                 in if extent ov `inShape` I'
                                                      then Just i'
                                                      else Nothing)
                                       ov

    subDim :: DIM2 -> DIM2 -> DIM2
    subDim (Z :. y :. x) (Z :. dy :. dx) = ix2 (y - dy) (x - dx)

main = do
  Right bg <- readImageFromBMP "background.bmp"
  Right l1 <- readImageFromBMP "layer1.bmp"
  Right l2 <- readImageFromBMP "layer2.bmp"
  Right l3 <- readImageFromBMP "layer3.bmp"

  out <- computeP $ drawLayersOnBg bg [ Layer l1 (200,100) , Layer l2 (100, 200), Layer l3 (-300,400) ]
  writeImageToBMP "out.bmp" out

With some random bitmaps from the internet I got this as an output:

If I wanted to, I could even have used fromForeignPtr to read bitmaps straight from some foreign memory buffer and computeIntoP instead of computeP to write the bitmap output straight into some foreign memory buffer (no intermediate allocations). If you need high-performance, those are especially interesting.