C# flood fill but with threshold allowing similar colors?

Using an image-generation AI I'm getting centered objects on a dark background. My goal is to convert all pixels outside this object to transparent. I figured a good-enough approach would be to flood-fill from all 4 corners using a fuzzy threshold, so that similar colors are erased too. But using e.g. the following recursive approach causes a StackOverflow:

static void FillPixels(Color[][] pixels, int x, int y, Color originColor, Color fillColor, float threshold)
{
    int width  = pixels.Length;
    int height = pixels[0].Length;

    bool isInLimits = x >= 0 && x < width && y >= 0 && y < height;
    if (isInLimits && ColorDistance(pixels[x][y], originColor) <= threshold)
    {
        pixels[x][y] = fillColor;

        FillPixels(pixels, x - 1, y, originColor, fillColor, threshold);
        FillPixels(pixels, x + 1, y, originColor, fillColor, threshold);
        FillPixels(pixels, x, y - 1, originColor, fillColor, threshold);
        FillPixels(pixels, x, y + 1, originColor, fillColor, threshold);
    }
}

The images are up to 1024x1024 pixels in size. The specific background color is unknown -- I can instruct the image AI to make it black, but it will usually not be a precise rgb(0,0,0) -- so I'm initially color-picking dynamically on each corner. What can be done to flood fill with a threshold, or otherwise find a good mask for the object to erase its background? Thanks!

Solution

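The first thing to check is that the distance between fillColor and originColor is larger than the threshold. Otherwise a filled pixel still passes the similarity test, so it is never recognized as done and neighboring pixels keep revisiting each other without end.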

An alternative would be to keep an explicit record of visited nodes, either with a bool[][] or a HashSet<(int x, int y)>.
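
As a rough illustration of the first check: the question does not show ColorDistance, so the sketch below assumes System.Drawing.Color and a hypothetical Euclidean distance over R, G and B that ignores alpha. Under that assumption, filling with a transparent black such as Color.FromArgb(0, 0, 0, 0) on a near-black background would leave painted pixels within the threshold, which is the case the check guards against. FillGuards and IsFillColorDistinct are made-up names.

```csharp
using System;
using System.Drawing;

static class FillGuards
{
    // Hypothetical stand-in for the question's ColorDistance:
    // Euclidean distance over R, G, B only (alpha is ignored).
    public static float ColorDistance(Color a, Color b)
    {
        int dr = a.R - b.R;
        int dg = a.G - b.G;
        int db = a.B - b.B;
        return (float)Math.Sqrt(dr * dr + dg * dg + db * db);
    }

    // True when a painted pixel can no longer pass the similarity test,
    // i.e. the fill terminates even without a separate visited record.
    public static bool IsFillColorDistinct(Color originColor, Color fillColor, float threshold)
    {
        return ColorDistance(fillColor, originColor) > threshold;
    }
}
```

If IsFillColorDistinct returns false, either pick a different fill color or track visited pixels explicitly.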

The next step would be to move to an iterative algorithm. For images, the worst-case recursion depth is width*height, which is over a million nested calls for a 1024x1024 image. The full worst case is unlikely to occur, but the actual depth can still get large enough to cause a stack overflow. Changing to an explicit stack should be very easy, something like:

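// startX/startY is the seed pixel (e.g. a corner); width, height, pixels,
// originColor, fillColor and threshold are the same as in the question's code.
var stack = new Stack<(int x, int y)>();
stack.Push((startX, startY));
while (stack.Count > 0)
{
    var (x, y) = stack.Pop();

    // Skip coordinates outside the image and pixels that are not similar enough.
    bool isInLimits = x >= 0 && x < width && y >= 0 && y < height;
    if (!isInLimits || ColorDistance(pixels[x][y], originColor) > threshold)
        continue;

    pixels[x][y] = fillColor;

    // Visit the four direct neighbors later instead of recursing into them.
    stack.Push((x - 1, y));
    stack.Push((x + 1, y));
    stack.Push((x, y - 1));
    stack.Push((x, y + 1));
}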
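
Putting the pieces together, here is a minimal sketch of a complete version that combines the explicit stack with a bool[,] visited record and seeds the fill from all four corners, as the question describes. It assumes System.Drawing.Color and an RGB-only Euclidean ColorDistance; BackgroundEraser, FloodFill and EraseBackground are illustrative names, not an established API.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;

static class BackgroundEraser
{
    // Assumed distance metric: Euclidean over R, G, B (alpha ignored).
    static float ColorDistance(Color a, Color b)
    {
        int dr = a.R - b.R, dg = a.G - b.G, db = a.B - b.B;
        return (float)Math.Sqrt(dr * dr + dg * dg + db * db);
    }

    // Iterative 4-connected flood fill; pixels is indexed as pixels[x][y].
    public static void FloodFill(Color[][] pixels, int startX, int startY,
                                 Color fillColor, float threshold, bool[,] visited)
    {
        int width = pixels.Length;
        int height = pixels[0].Length;
        if (visited[startX, startY]) return; // an earlier fill already reached this seed

        Color originColor = pixels[startX][startY]; // color-pick at the seed
        var stack = new Stack<(int x, int y)>();
        stack.Push((startX, startY));

        while (stack.Count > 0)
        {
            var (x, y) = stack.Pop();
            if (x < 0 || x >= width || y < 0 || y >= height) continue;
            if (visited[x, y]) continue;
            if (ColorDistance(pixels[x][y], originColor) > threshold) continue;

            visited[x, y] = true;
            pixels[x][y] = fillColor;

            stack.Push((x - 1, y));
            stack.Push((x + 1, y));
            stack.Push((x, y - 1));
            stack.Push((x, y + 1));
        }
    }

    // Runs one fill per corner, sharing the visited mask between them.
    public static void EraseBackground(Color[][] pixels, float threshold)
    {
        int w = pixels.Length, h = pixels[0].Length;
        var visited = new bool[w, h];
        foreach (var (x, y) in new[] { (0, 0), (w - 1, 0), (0, h - 1), (w - 1, h - 1) })
            FloodFill(pixels, x, y, Color.Transparent, threshold, visited);
    }
}
```

Because the visited mask decides whether a pixel is done, this version does not depend on the fill color being far from the origin color. Pixels that fail the similarity test are left unmarked, so a later fill seeded from another corner can still claim them against its own origin color.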

I would also consider using some better data types. A multidimensional array, i.e. Color[,], would be better, since then at least you know all rows have the same length. But in image processing it is fairly common to use raw data, i.e. byte[] or Span<byte>, and calculate pixel indices by hand: var indexToFirstPixelByte = y * span + x * bytesPerPixel, where span (usually called the stride) is the number of bytes in a row. Then you can fetch the bytes for your color directly. This should save time and memory, since a Color struct is much larger than the 4 bytes required for ARGB. Using a Point or another type to represent a pair of x,y coordinates is probably also a good idea.
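
The index calculation above can be sketched as follows. This assumes 4 bytes per pixel with the alpha channel in the last byte (e.g. a BGRA layout); check the pixel format of your actual buffer before relying on that. RawPixels and its members are hypothetical names.

```csharp
using System;

static class RawPixels
{
    // Assumption: 4 bytes per pixel, alpha stored in the last byte.
    public const int BytesPerPixel = 4;

    // stride = number of bytes per row; it may exceed width * BytesPerPixel
    // when rows are padded.
    public static int IndexOf(int x, int y, int stride)
    {
        return y * stride + x * BytesPerPixel;
    }

    // Clears only the alpha byte, leaving the color channels as they were.
    public static void MakeTransparent(Span<byte> data, int x, int y, int stride)
    {
        data[IndexOf(x, y, stride) + 3] = 0;
    }
}
```

For a 2x2 image with no row padding the stride is 8, so pixel (1, 1) starts at byte 12 and its alpha byte is at index 15.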