Numpy: classify 2D array raster based on combinations of two 2D array rasters


I have two 2D numpy arrays representing raster images where classes 1, 2, 3 exist.

I need to obtain all possible combinations of these two arrays (3 × 3 = 9 in this case) and create a new 2D array that is classified to 1-9 based on which combination occurs at each pixel.

The arrays also contain np.nan values, as is usual for rasters (they must be an M×N rectangle), but np.nan never occurs together with a valid category in the same pixel.

Here is an example of what I have in mind, using small sample arrays:

Input arrays:

import numpy as np

a = np.array([
    [1,2,3],
    [3,1,1]
])

b = np.array([
    [1,2,1],
    [3,3,3]
])

Possible combinations (imagine the arrays above are much bigger, so all of these combinations probably occur):

possible_combinations = [
    (1,1), # class 1
    (2,1), # class 2
    (3,1), # class 3 
    (1,2), # class 4
    (1,3), # class 5
    (2,2), # class 6
    (2,3), # class 7
    (3,2), # class 8
    (3,3)  # class 9
]

Newly classified array:

>>> array(
       [[1, 6, 3],
        [9, 5, 5]]
    )

Edit

@Andrej Kesely's answer essentially solved it. One issue I forgot to mention explicitly arises when two nan values meet in the same pair/pixel vector. This is not a desired class (nor a meaningful combination, since np.nan != np.nan). To drop these pairs we just need to modify the last step of the solution:

# a, b are the two input rasters; NaN/NaN pairs stay NaN instead of being looked up
x = np.array(
    [[np.nan if np.isnan(pair).all() else possible_combinations[tuple(pair)] for pair in row]
     for row in np.dstack((a, b))]
)

This is becoming a pretty awful one-liner, but it still works.
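The NaN handling described above can also be written without the nested comprehension. The following is only a sketch under stated assumptions: the helper name `classify_with_nan` is hypothetical, the arrays are floats (NaN has no integer form), and the class numbering is copied from the list in the question. It fills the output with NaN first and then looks up only the pixels where neither input is NaN.

```python
import numpy as np

# Sample rasters with one NaN/NaN pixel (illustrative data, not from the question).
a = np.array([[1, 2, np.nan], [3, 1, 1]], dtype=float)
b = np.array([[1, 2, np.nan], [3, 3, 3]], dtype=float)

# Class numbering copied from the question's list.
possible_combinations = {
    (1, 1): 1, (2, 1): 2, (3, 1): 3,
    (1, 2): 4, (1, 3): 5, (2, 2): 6,
    (2, 3): 7, (3, 2): 8, (3, 3): 9,
}

def classify_with_nan(a, b, classes):
    """Hypothetical helper: look up each valid pixel pair, leave NaN pairs as NaN."""
    out = np.full(a.shape, np.nan)
    # A pixel is valid only if neither raster is NaN there.
    valid = ~(np.isnan(a) | np.isnan(b))
    # Boolean indexing flattens in row-major order, so the lookups line up
    # with the positions they are assigned back to.
    out[valid] = [classes[(int(p), int(q))] for p, q in zip(a[valid], b[valid])]
    return out

x = classify_with_nan(a, b, possible_combinations)
print(x)
```

Because the result has to hold NaN, it is a float array; the class numbers come back as 1.0, 6.0 and so on.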


Solution

  • You can use np.dstack to "zip" the two matrices.

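    # to speed up lookups, convert possible_combinations to a dict
    # mapping each (a, b) pair to its class number (1-9):
    possible_combinations = {c: i for i, c in enumerate(possible_combinations, 1)}
    
    # np.dstack gives shape (rows, cols, 2): one (a, b) pair per pixel
    x = np.array(
        [[possible_combinations[tuple(pair)] for pair in row] for row in np.dstack((a, b))]
    )
    print(x)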
    

    Prints:

    [[1 6 3]
     [9 5 5]]
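For large rasters the nested comprehension runs Python code for every pixel. One possible alternative, sketched here on the assumption that both arrays hold small non-negative integer class codes and no NaN, is to build a lookup table and index it with the arrays themselves. The names `combos` and `lut` are illustrative and not part of the answer above.

```python
import numpy as np

a = np.array([[1, 2, 3], [3, 1, 1]])
b = np.array([[1, 2, 1], [3, 3, 3]])

# Ordered list from the question; position + 1 is the class number.
combos = [(1, 1), (2, 1), (3, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 2), (3, 3)]

# lut[p, q] holds the class for the pair (p, q); 0 marks "no class defined".
# Size 4 x 4 so that the class codes 1..3 can be used directly as indices.
lut = np.zeros((4, 4), dtype=int)
for class_id, (p, q) in enumerate(combos, 1):
    lut[p, q] = class_id

# Integer-array indexing picks lut[a[i, j], b[i, j]] for every pixel at once.
x = lut[a, b]
print(x)
```

For the sample arrays this gives the same result as the comprehension. Rasters that contain NaN would need to be masked first, since NaN cannot be used as an index.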