Reducing memory use in grid movement combinatorics

I eventually want to expand to as big as I can get. Starting position is the 0th cell in red for now but I would start with a list of all the single cells. This is iteration 1.

A 4 by 4 grid, cells labeled and a red dot in the first cell.

I've marked cells that the starting position can visit green. You can only move up, down, left, or right, but you can move a variable distance. For example you can move one or 3 tiles to the right. Back to this example, I could move to 1, 2, 3, 4, 8, and 12 (I would repeat this for all 16 possible starting positions). As a set of pairs they are (0, 1), (0, 2), ... , (0, 12). This is the second iteration.

The same grid as above updated to reflect cells that can be visited.

Then for each pair get all possible triplets. In the example I'm using (0, 8). So they are (0, 1, 8), (0, 2, 8), (0, 3, 8), (0, 4, 8), (0, 8, 9), (0, 8, 10), (0, 8, 11) and (0, 8, 12).

Possible positions of (0, 8).

I keep repeating iterations until there's no more empty space but it's slow and uses a lot of memory. I'm storing the positions as bitboards. It uses a breadth-first search. My optimizations might prune too much too soon leading to not all combinations being found.

internal class ValidPositionGenerator
{
    // Used to store new bitboards we need to check
    private readonly Queue<ulong> bitBoards = new Queue<ulong>();

    // Used to store unique bitboards we have found
    private readonly HashSet<ulong> visited = [];

    public ValidPositionGenerator(int maxTiles)
    {

        int i = 0;

    // This corresponds to the first iteration
        while (i < 64)
        {
            bitBoards.Enqueue(CanonicalizeBitboard((ulong)Math.Pow(2, i)));
            i++;
        }

        GenerateValid(bitBoards, maxTiles);
    }

    private void GenerateValid(Queue<ulong> bitBoards, int maxTiles)
    {
        int i = 0;
        while (bitBoards.Count != 0)
        {
            ulong board = bitBoards.Dequeue();

            if (visited.Contains(board))
            {
                continue;
            }

            if (BitOperations.PopCount(board) > maxTiles)
            {
                continue;
            }

            visited.Add(board);

            GetNewPositions(board, bitBoards, visited);

            i++;
        }
    }


    private static void GetNewPositions(ulong board, Queue<ulong> bitBoards, HashSet<ulong> visited)
    {
        // Get the active bits from board
        List<int> indices = GetIndices(board);

        int i;
        ulong canonical;

        // Loop over each one and attempt to add each one to the output
        foreach (int index in indices)
        {
            (int, int) pair = GetXYPairFromIndex(index);
            // Up
            i = 1;
            while (pair.Item2 - i > -1)
            {
                // Check if the next bit is already set: Optimization 1
                if (Util.GetBitboardCell(board, pair.Item1, pair.Item2 - i) == true)
                {
                    break;
                }
                // Optimization 2, 3, and 4 happens inside CanonicalizeBitboard()
                canonical = CanonicalizeBitboard(SetBitboardBit(board, pair.Item1, pair.Item2 - i, true));
                if (!visited.Contains(canonical))
                {
                    bitBoards.Enqueue(canonical);
                }
                i++;
            }
            // I've removed the rest to keep the code shorter
        // Left
            // Right
            // Down
            }
        }
    }

    private static List<int> GetIndices(ulong board)
    {
        ulong temp = board;
        List<int> indices = new List<int>();
        int index = 0;
        while (board > 0)
        {
            if ((board & 1) == 1)
            {
                indices.Add(index);
            }
            board >>= 1;
            index++;
        }
        return indices;
    }

    // Convert index to an x, y pair
    private static (int, int) GetXYPairFromIndex(int index)
    {
        return (index % 8, index / 8);
    }
}

Optimization 1: Don't double check

If I find an already active position I can stop because I can continue from there. In the example below I'm on (0, 2) scanning for bits to the right of 0. When I reach 2 that is already active so I can stop. Then I continue down left before scanning bits around 2. From here I would continue to the right getting anything I missed. I haven't tested if this improves anything.

(0, 2)

I think an optimization is to create another bitboard with all places visited and stop when I find something already found. Because once I get to 2 it'll start scanning left towards 0 but it already found everything in that direction. I think I can merge this with the first optimization. It seems an extension of it.

Optimization 2: Don't store mirrors or rotations

Every combination has symmetry and I only need to keep one. canonicalizeBitboard checks against rotations and reflections which reduces search space. Instead of storing 16 possibilities in the first iteration I only need to store 3: 0, 1, and 5.

From 0, 1, or 5 you can get any other number by rotating or flipping the board. 0 rotated 90 gives 3. 1 flipped gives 2.

Optimization 3: Don't store similar translations

The two grids below are the same to me. So when I canonicalize the bitboard I'm also shifting it so it's flushed to the top left corner.

These are basically the same thing

Optimization 4: Condensed

Considering following two bitboards I only care about unique combinations of moving up, down, left or right. If I start from 0 I can move right and down in both of them so I only need one. I think I'll still need to keep it for when I check for new bitboards, but I don't need it in my output. I'm confused about this for my other optimizations as well. I think I should still check them because the position I can get from them might not be in their condensed version.

An original bitboard and its condensed version

Conclusion

That's all optimizations I can think of. All I'm doing is finding all possible combinations/permutations of up, down, left or right where the sum of left and right equals the size of the width of the bitboard - 1 and similarly with up and down, but with some extra rules. I think there might be a way as a string of up, down, left, and right that gets mapped to a canonical bitboard.

What I've tried

My code works. I'm wondering if there's a better way. When I call ValidPositionGenerator() with 6 as input it takes 6 seconds and half a gigabyte of memory. At 7 it takes 2 minutes and 16 gigabyte. The bitBoards queue blows up and that might be unavoidable. There are 2^64 possible bitboards but I'm hoping my optimizations makes that manageable.

Solution

I am not sure why you are using a breadth-first search. Usually it is easier to reason about depth-first algorithms (at least for me). Recursive depth-first back-tracking algorithms do not require a queue and the intermediate results are automatically stored temporarily on the stack.

If you stored the result as "RD" (i.e., right, down) instead of e.g. 0, 2, 14 in a HashSet<T>, then this would automatically eliminate duplicates of "original" versus "condensed" as well as duplicates given by translations. Also, since you are not interested in mirrored patterns, there is no need to consider other cells than your 0-cell as starting point. Other cells would allow you to start by going up or left and would result in a mirrored pattern compared to first going down or right.

I am not sure whether this exactly what you need, but here is my attempt:

internal class ValidPositionGenerator
{
    const int N = 4;

    private readonly bool[,] _board = new bool[N, N];
    private readonly StringBuilder _pattern = new();

    public HashSet<string> Positions { get; } = new();

    public void Generate(int x, int y)
    {
        _board[x, y] = true;

        // Move right
        if (x < N - 1 && !_board[x + 1, y]) {
            if (_pattern.Length == 0 || _pattern[^1] != 'R') {
                _pattern.Append('R');
                Positions.Add(_pattern.ToString());
                Generate(x + 1, y);
                _pattern.Length--;
            } else {
                Generate(x + 1, y);
            }
        }

        // Move down
        if (y < N - 1 && !_board[x, y + 1]) {
            if (_pattern.Length == 0 || _pattern[^1] != 'D') {
                _pattern.Append('D');
                Positions.Add(_pattern.ToString());
                Generate(x, y + 1);
                _pattern.Length--;
            } else {
                Generate(x, y + 1);
            }
        }

        // Move left
        if (x > 0 && !_board[x - 1, y]) {
            if (_pattern.Length == 0 || _pattern[^1] != 'L') {
                _pattern.Append('L');
                Positions.Add(_pattern.ToString());
                Generate(x - 1, y);
                _pattern.Length--;
            } else {
                Generate(x - 1, y);
            }
        }

        // Move up
        if (y > 0 && !_board[x, y - 1]) {
            if (_pattern.Length == 0 || _pattern[^1] != 'U') {
                _pattern.Append('U');
                Positions.Add(_pattern.ToString());
                Generate(x, y - 1);
                _pattern.Length--;
            } else {
                Generate(x, y - 1);
            }
        }

        _board[x, y] = false; // back-track
    }
}

I tested it with

var generator = new ValidPositionGenerator();
generator.Generate(0, 0);
foreach (string position in generator.Positions) {
    Console.WriteLine(position);
}

Note that this might generate moves like RDLU where the last U bumps into the starting point. This happens because the notation is condensed. The "real" moves should either be RDDLU or RDLLU. I don't know if you need to know this or whether knowing the turns is enough.

According to your comment you want the unpacked version. Of course there will many more unpacked versions than packed ones.

It can be achieved by replacing this (showing only one direction but would apply to all four):

// Move right
if (x < N - 1 && !_board[x + 1, y]) {
    if (_pattern.Length == 0 || _pattern[^1] != 'R') {
        _pattern.Append('R');
        Positions.Add(_pattern.ToString());
        Generate(x + 1, y);
        _pattern.Length--;
    } else {
        Generate(x + 1, y);
    }
}

// Move right
if (x < N - 1 && !_board[x + 1, y]) {
    _pattern.Append('R');
    Positions.Add(_pattern.ToString());
    Generate(x + 1, y);
    _pattern.Length--;
}

On a 4 x 4 grid 382 packed and 2110 unpacked positions are created. I.e., ~5.5 time more unpacked than packed.

For a 6 x 6 grid it gets worse: 352,098 packed vs. 31,811,176 unpacked. I.e., ~90 times more unpacked!