Search code examples
arraysperformancealgorithmpseudocode

How to find open rectangle with specific size efficiently?


Background

I have a rectangular area divided up into squares (this is for a game I'm making). I'm representing this in my code as a simple two-dimensional boolean array:

  ┌──┬──┬──┬──┬──┐
  │  │  │  │  │  │ X = X-value, also increasing width
  ├──┼──┼──┼──┼──┤ Y = Y-value, also increasing length
  │  │  │  │  │  │
  ├──┼──┼──┼──┼──┤
  │  │  │  │  │  │
  ├──┼──┼──┼──┼──┤
  │  │  │  │  │  │
  ├──┼──┼──┼──┼──┤
^ │  │  │  │  │  │
Y └──┴──┴──┴──┴──┘
   X >

Some of the squares can be taken up in rectangles by buildings, etc. like this input (** = taken):

┌──┬──┬──┬──┬──┐
│**│**│  │  │  │ 3 taken rectangles
├──┼──┼──┼──┼──┤
│**│**│**│**│  │
├──┼──┼──┼──┼──┤
│  │  │**│**│  │
├──┼──┼──┼──┼──┤
│  │  │**│**│  │
├──┼──┼──┼──┼──┤
│**│  │**│**│  │
└──┴──┴──┴──┴──┘

In the 2D boolean array, "taken" squares are set to true and "open, non-taken" squares are set to false.

My problem

I need to find all "open" rectangles (non-taken) that are of a specific size. This is because I need to find all the possible spaces to put the next building. For example, from the previous figure, if I wanted to get all "open" 1x2 rectangles I should get these outputs:

┌──┬──┬──┬──┬──┐
│**│**│1 │12│2 │ 3 taken rectangles (input)
├──┼──┼──┼──┼──┤ 4 open 1x2 rectangles (output)
│**│**│**│**│  │ Open rectangles are numbered
├──┼──┼──┼──┼──┤ 
│3 │3 │**│**│  │ Rectangles can overlap.
├──┼──┼──┼──┼──┤ The '12' square represents the overlapping
│4 │4 │**│**│  │ of rectangle 1 and 2.
├──┼──┼──┼──┼──┤ 
│**│  │**│**│  │ (Haha, my diagram kind of looks like old Minesweeper)
└──┴──┴──┴──┴──┘

What I've done

Here's I've tested (brute-force search, code is C# and a bit of pseudocode ):

List<Rectangle> findOpenSpaces(boolean[][] area, int width, int length) {
    List<Rectangle> openSpaces = new List<Rectangle>();
    boolean open = true;
    // Loop through all rectangles with size "width" and "length"
    for x = 0 to (total area length) - length {
        for y = 0 to (total area width) - width {
            // Check this rectangle to see if any squares are taken
            Rectangle r = new Rectangle(x, y, width, length);
            if checkRectangle(area, r) returns true {
                // rectangle was fully open, add to the list
                openSpaces.Add(r);
            }
        }
    }
    return openSpaces;
}

boolean checkRectangleIsOpen(boolean[][] area, Rectangle r) {
    for i = r.x to r.width {
        for j = r.y to r.length {
            if area[i][j] is true {
                // means that this square in the rectangle is taken,
                // thus the rectangle is not fully open
                // so return false (not open)
                return false;
            }
        }
    }
    return true; // no square in the rectangle was taken
}

struct Rectangle { // just a struct to help store values
    int x, y, width, length;
}

Question

The above (pseudo)code works, but if you look at the it, it's time comes to O(n^4) :( (I think, because of four nested for loops, but I'm no expert). Plus, in the the game the total size rectangular area is 50x50, not 5x5 like my examples here. The game lags visibly whenever I do this operation.

I Googled this, but I'm not exactly sure what to call this problem. I'd really appreciate if anyone could show me an efficient algorithm to do this task. Thanks :)


Solution

  • Some quick thinking will show that you simply cannot do this any faster than O(NM), because there are at least that many possible outputs. You already have an O(N^2M^2) solution, so let's see if we can find a O(NM).

    What I would do is, instead of finding locations that can form a rectangle of size a x b , find locations that cannot.

    That is to say, something like this (pseudocode):

    for each taken square:
        for each rectangle starting position:
            if rectangle at starting position includes this taken square,
                mark starting position as non-tenable
    

    If your rectangles are small (like the 1x2 example you had above), this (efficiently implemented) is perfectly sufficient. However, for an approach that doesn't rely on the size of the rectangles for asymptotic speed . . .

    Consider that (x,y) is taken. Then the rectangles that overlap with this point form a rectangle themselves, and nothing in that range can be used. So we want to mark a (discrete) range as not usable. This can be done with a 2-D Fenwick tree, which has cost O(log N log M) per update or query. To mark a range (x1, y1, x2, y2) as "in use", we:

    1. Increment (x1, y1) by one
    2. Decrement (x1, y2) by one
    3. Decrement (x2, y1) by one
    4. Increment (x2, y2) by one

    So we end up doing 4 2D Fenwick tree operations per used square, of which there are at most O(NM). Then, when we're done, we check each the value of each possible starting position: if it's zero, it's a valid place to start. If it's nonzero, it cannot be used.

    Total cost: O(NM log N log M). While you can likely do this faster, the speed/size ratio of implementation for this approach is extremely good (a 2-D Fenwick tree is roughly ten lines to implement).