algorithm graph signal-processing maze best-fit

How to Fix False Negative

Background:

I am making a program that detects the grid of a maze (like this). The way I do this is by getting the average color of each row/column and graphing it to locate the general grid lines (like this). With these grid lines I can group each row/column that is under the color threshold and map a line on the maze.

Problem:

I am running into a problem with certain mazes where a grid line has no vertical walls along it. My algorithm then fails to detect that line (a false negative), which produces errors as shown below.

Question:

What methods would you suggest for a fix to this problem?

Note: I was thinking something like pattern detection to fill in the missing data?

Solution

If your input maze is guaranteed to be based on a grid, like the images you show, then I would suggest a more deterministic approach.

It is probably sufficient to find one wall on each column. So instead of averaging all pixels in a column (which loses a lot of useful information), you can measure e.g. the longest consecutive run of black pixels. If this run is much longer than the width of a wall, then it must run along the length of a wall, and thus you know the column lies on a grid line.
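A minimal sketch of that per-column measurement, assuming the maze is a grayscale image stored as a list of rows with values 0-255. The names `longest_black_run` and `wall_columns`, the `threshold` of 128 and the `factor` of 3 are illustrative assumptions, not something taken from your program:

```python
def longest_black_run(column, threshold=128):
    """Return the length of the longest run of consecutive dark pixels."""
    best = current = 0
    for value in column:
        if value < threshold:  # assumption: pixels darker than threshold count as wall
            current += 1
            best = max(best, current)
        else:
            current = 0
    return best


def wall_columns(image, wall_width, factor=3, threshold=128):
    """Return the indices of columns whose longest dark run is much longer
    than a wall is wide (here: more than factor * wall_width pixels)."""
    height = len(image)
    width = len(image[0]) if height else 0
    flagged = []
    for x in range(width):
        column = [image[y][x] for y in range(height)]
        if longest_black_run(column, threshold) > factor * wall_width:
            flagged.append(x)
    return flagged
```

A column that only crosses horizontal walls sees short runs (about one wall width each), so it is not flagged no matter how many walls it crosses. That is the information the average throws away.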

When you have done this for all columns, you get a discrete graph instead and you can choose a value somewhere in the middle of each peak for the actual column line.
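Picking the middle of each peak could look roughly like this. It assumes you already have a list of flagged column indices; `peak_centres` and `max_gap` are hypothetical names used only for this sketch:

```python
def peak_centres(flagged, max_gap=1):
    """Collapse each run of adjacent flagged column indices into one centre index."""
    centres = []
    start = prev = None
    for x in sorted(flagged):
        if start is None:
            start = prev = x           # first index opens the first run
        elif x - prev <= max_gap:
            prev = x                   # still inside the same peak
        else:
            centres.append((start + prev) // 2)  # close the peak at its midpoint
            start = prev = x
    if start is not None:
        centres.append((start + prev) // 2)      # close the final peak
    return centres
```

For example, `peak_centres([3, 4, 5, 20, 21, 40])` gives `[4, 20, 40]`: one grid line per peak, regardless of how thick the wall is.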

Some grid lines might not have any vertical walls at all, but you can easily interpolate these once you have found at least 3 grid lines.

Another approach would be to perform some signal processing and find the period of your function, but I think simple interpolation would be easier to implement and understand.
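For completeness, one way the period could be estimated is plain autocorrelation of the per-column profile. This is only a sketch of one possible signal-processing route, not necessarily the best one. `estimate_period` and `min_lag` are made-up names, and `profile` is assumed to be one number per column (for example 1 where a wall was detected and 0 elsewhere):

```python
def estimate_period(profile, min_lag=2):
    """Return the lag (in pixels) with the highest autocorrelation,
    or None if the profile is too short to test any lag."""
    n = len(profile)
    max_lag = n // 2  # larger lags overlap too little to be trusted
    if max_lag < min_lag:
        return None
    mean = sum(profile) / n
    centred = [v - mean for v in profile]  # remove the constant offset
    best_lag, best_score = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the profile with a copy of itself shifted by `lag`.
        score = sum(centred[i] * centred[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

Set `min_lag` larger than the wall thickness, otherwise a thick wall correlating with itself can win over the real grid spacing.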

Edit: The interpolation can be done in different ways. In the easiest case, you assume that at least one pair of detected columns are "neighbours", i.e., adjacent in the grid, and that you detect the first and last column.

In this case, all you need to do is find the smallest distance between neighbours to find the grid cell width. You can also compare it with the cell height and choose whichever is smaller. Then, apply this width between the first and last columns to get all the columns.
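A small sketch of this easy case, under exactly the assumptions stated above (the first and last grid lines are detected, and at least one pair of detected lines is adjacent). `fill_grid_lines` is a hypothetical helper name:

```python
def fill_grid_lines(detected):
    """Fill in missing grid lines between the first and last detected ones."""
    lines = sorted(set(detected))
    if len(lines) < 2:
        return lines
    # The smallest gap between neighbouring detections is taken as the cell width.
    cell = min(b - a for a, b in zip(lines, lines[1:]))
    first, last = lines[0], lines[-1]
    count = round((last - first) / cell)   # number of cells across the maze
    step = (last - first) / count          # evens out one-pixel detection jitter
    return [round(first + i * step) for i in range(count + 1)]
```

With detections at `[0, 20, 60, 100]` the smallest gap is 20, so the result is `[0, 20, 40, 60, 80, 100]`. Recomputing `step` from the total width means a slightly off measurement such as `[0, 21, 60, 100]` still yields the same evenly spaced lines.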

Another approach, if you can't make this assumption, is to take the spacing between each pair of neighbouring detected columns and repeat it with the same period throughout the grid, counting from the front and from the back, like so:

|_ _ _ _|_ _ _ _ _ _| => |_ _ _ _|_ _ _ _|_ _| => |_ _|_ _|_ _|_ _|_ _|

and repeating until no more edits are being made.
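The repeated propagation can be sketched as below. This is my own reading of the idea, and it assumes exact integer positions with the outermost lines already detected; real detections that are off by a pixel would need a tolerance when comparing positions. `propagate_gaps` is an illustrative name:

```python
def propagate_gaps(detected):
    """Repeat every observed gap from the front and from the back of the grid
    until no new lines appear."""
    lines = set(detected)
    if len(lines) < 2:
        return sorted(lines)
    first, last = min(lines), max(lines)
    changed = True
    while changed:
        changed = False
        ordered = sorted(lines)
        gaps = {b - a for a, b in zip(ordered, ordered[1:])}
        for gap in gaps:
            # Step through the grid with this gap, once from each end.
            from_front = set(range(first, last + 1, gap))
            from_back = set(range(last, first - 1, -gap))
            new = (from_front | from_back) - lines
            if new:
                lines |= new
                changed = True
    return sorted(lines)
```

For the example in the diagram, lines at 0, 4 and 10 end up as `[0, 2, 4, 6, 8, 10]`. The loop always terminates because every pass either adds a line between the first and last positions or stops.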