Search code examples
haskellrepaaccelerate-haskell

Emulating non-rectangular arrays


Often times you want the performance of arrays over linked lists while having not conforming to the requirement of having rectangular arrays.

As an example consider an hexagonal grid, here shown with the 1-distance neighbors of cell (3, 3) in medium gray and the 2-distance neighbors in light gray. enter image description here Say we want an array that contains, for each cell, the indices of every 1- and 2-distance neighbor for that cell. One slight issue is that cells have a different amount of X-distance neighbors -- cells on the grid border will have fewer neighbors than cells closer to the grid center.

(We want an array of neighbor indices --- instead of a function from cell coordinates to neighbor indices --- for performance reasons.)

We can work around this problem by keeping track of how many neighbors each cell has. Say you have an array neighbors2 of size R x C x N x 2, where R is the number of grid rows, C for columns, and N is the maximum number of 2-distance neighbors for any cell in the grid. Then, by keeping an additional array n_neighbors2 of size R x C, we can keep track of which indices in neighbors2 are populated and which are just zero padding. For example, to retrieve the the 2-distance neighbors of cell (2, 5), we simply index into the array as such:

someNeigh = neighbors2[2, 5, 0..n_neighbors2[2, 5], ..]

someNeigh will be a n_neighbors2[2, 5] x 2 array (or view) of indicies, where someNeigh[0, 0] yields the row of the first neighbor, and someNeigh[0, 1] yields the column of the first neighbor and so forth. Note that the elements at the positions

neighbors2[2, 5, n_neighbors2[2, 5]+1.., ..]

are irrelevant; this space is just padding to keep the matrix rectangular.

Provided we have a function for finding the d-distance neighbors for any cell:

import           Data.Bits (shift)
rows, cols = (7, 7)
type Cell = (Int, Int)

generateNeighs :: Int -> Cell -> [Cell]
generateNeighs d cell1 = [ (row2, col2)
                            | row2 <- [0..rows-1]
                            , col2 <- [0..cols-1]
                            , hexDistance cell1 (row2, col2) == d]

hexDistance :: Cell -> Cell -> Int
hexDistance (r1, c1) (r2, c2) = shift (abs rd + abs (rd + cd) + abs cd) (-1)
  where
    rd = r1 - r2
    cd = c1 - c2

How can we create the aforementioned arrays neighbors2 and n_neighbors2? Assume we know the maximum amount of 2-distance neighbors N beforehand. Then it is possible to modify generateNeighs to always return lists of the same size, as we can fill up remaining entries with (0, 0). That leaves, as I see it, two problems:

  • We need a function to populate neighbors2 which operates not every individual index but on a slice, in our case it should fill one cell at a time.
  • n_neighbors2 should be populated simultaneously as neighbors2

A solution is welcome with either repa or accelerate arrays.


Solution

  • Here's you picture skewed 30 degrees to the right:

    enter image description here

    As you can see your array is actually perfectly rectangular.

    The indices of a neighborhood's periphery are easily found as six straight pieces around the chosen center cell, e.g. (imagine n == 2 is the distance of the periphery from the center (i,j) == (3,3) in the picture):

    periphery n (i,j) = 
       --     2 (3,3)
      let 
        ((i1,j1):ps1) = reverse . take (n+1) . iterate (\(i,j)->(i,j+1)) $ (i-n,j) 
        --                                                                 ( 1, 3)
        ((i2,j2):ps2) = reverse . take (n+1) . iterate (\(i,j)->(i+1,j)) $ (i1,j1) 
        --                                                                 ( 1, 5)
        .....
        ps6           = .......                                          $ (i5,j5)
      in filter isValid (ps6 ++ ... ++ ps2 ++ ps1)
    

    The whole neighborhood is simply

    neighborhood n (i,j) = (i,j) : concat [ periphery k (i,j) | k <- [1..n] ]
    

    For each cell/distance combination, simply generate the neighborhood indices on the fly and access your array in O(1) time for each index pair.