python numpy unit-testing

Compare two boolean arrays considering a tolerance


I have two boolean arrays, first and second, that should be mostly equal (up to a tolerance). I would like to compare them in a way that tolerates a few differing elements.

Something like np.array_equal(first, second, equal_nan=True) is too strict because every value must match, and np.allclose(first, second, atol=tolerance, equal_nan=True) is not suitable for comparing booleans.

The following case should succeed:

import numpy as np

tolerance = 1e-5
seed = np.random.rand(100, 100, 100)
first = seed > 0.5
second = (seed > 0.5) & (seed < 1. - 1e-6) # 99.9999% overlap in true elements

The following case should fail:

first = seed > 0.5
second = (seed > 0.5) & (seed < 1. - 1e-4) # 99.99% overlap in true elements

The following case should also fail:

first = seed > 0.5
second = first[::-1] # first.sum() == second.sum(), but they are not similar

How can I handle this case in an elegant manner?


Solution

  • Simon's answer was pretty close to what I needed. However, I preferred a volume overlap metric (e.g. Dice or IoU) over normalizing by the size of the volume. Both Dice and IoU lie in the range [0, 1], which is convenient here: 0 means no overlap and 1 means perfect overlap.

    Dice implementation:

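    tolerance = 1e-5
    # Dice = 2 * |A & B| / (|A| + |B|); 1 means perfect overlap
    dice = 2 * (first & second).sum() / (first.sum() + second.sum())
    if 1 - dice > tolerance:
        raise AssertionError(f"Dice {dice} is below 1 - {tolerance}")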

    IoU implementation:

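    # IoU = |A & B| / |A | B|; 1 means perfect overlap
    iou = (first & second).sum() / (first | second).sum()
    if 1 - iou > tolerance:
        raise AssertionError(f"IoU {iou} is below 1 - {tolerance}")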
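
For use in a test suite, the two metrics above could be wrapped in a small helper. The following is only a sketch: the names overlap_scores and assert_masks_close are hypothetical, and treating two all-False arrays as a perfect match is an assumption made here to avoid dividing by zero, not something the answer above specifies.

```python
import numpy as np


def overlap_scores(first, second):
    """Return (dice, iou) for two boolean arrays of the same shape."""
    first = np.asarray(first, dtype=bool)
    second = np.asarray(second, dtype=bool)
    if first.shape != second.shape:
        raise ValueError("arrays must have the same shape")

    intersection = np.count_nonzero(first & second)
    union = np.count_nonzero(first | second)
    total = np.count_nonzero(first) + np.count_nonzero(second)

    # Assumption: two all-False arrays count as a perfect match,
    # which also avoids a division by zero.
    if union == 0:
        return 1.0, 1.0
    return 2.0 * intersection / total, intersection / union


def assert_masks_close(first, second, tolerance=1e-5):
    """Raise AssertionError if the Dice score falls short of 1 by more than tolerance."""
    dice, _ = overlap_scores(first, second)
    if 1.0 - dice > tolerance:
        raise AssertionError(f"Dice {dice:.8f} is below 1 - {tolerance}")
```

Note that the two metrics are related by dice = 2 * iou / (1 + iou), so near perfect overlap 1 - iou is roughly twice 1 - dice. The same tolerance is therefore about twice as strict when applied to IoU.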