Search code examples
pythonpython-3.xnumpypython-xarray

3D connected components with periodic boundary conditions


I am trying to identify connected features on the globe (i.e., in space-time on a sphere). The cc3d package has gotten me 90% of the way but I am struggling to deal with the date border (i.e., the periodic boundary conditions in one of the three dimensions).

On a 2D map, the problem of my approach becomes apparent (note the connected buy differently labelled region around 0 longitude near the South Pole):

![enter image description here

This appears because the data are defined on longitudes from 0 to 360 (while I show shown them from -180 to 180 here to make the problem more apparent).

My two typical solutions for these kinds of problems do not work:

  • Flipping the anti-meridian to the pacific just shifts the problem and therefore does not help.
  • Concatenating a copy of the data at right hand boundary does also not help because it leads to ambiguity in the labels between the original left side and the pasted data on the right

MWE

A break-down of the problem in 2 dimensions should be the following:

import cc3d
import numpy as np
import matplotlib.pyplot as plt

arr = np.sin(np.random.RandomState(0).randn(100).reshape(10, 10)) > .3
labels = cc3d.connected_components(arr)
map = plt.pcolormesh(labels, vmin=.5)
map.cmap.set_under('none')

enter image description here

Here the yellow structure in the right top should be connected to the rest of the top structure and same for the two structures at the bottom. Please keep in mind that any helpful solution should also work for connected features in 3 dimensions.

Any help on how to approach this is appreciated.


Solution

  • Update April 2024: the option for periodic boundary conditions is now implemented in cc3d directly (see comment by SapphireSun below).


    Okay I have a fix for the output. Its quite slow and requires to run connected_components twice but works for me. I'm putting it here in case its helpful for anyone later.

    I'll put the fix for the MWE here but it works equivalently for full 3D space-time data (independently for each day). The convention for the data I'm working with is that the longitude is the last dimension, so (time, latitude, longitude) but it is plotted on the x-axis, therefore I do a swapaxes for all function calls in the MWE example.

    Functions

    def plot(arr):
        map = plt.pcolormesh(arr, vmin=.5)
        map.cmap.set_under('none')    
     
        for xx in range(len(arr)):
            for yy in range(len(arr)):
                plt.text(yy+.5, xx+.5, arr[xx, yy],
                         ha="center", va="center", color="k")
    
    
    def cut_stitch(arr):  # equivalent to switching the antimeridian [0, 360[ -> [-180, 180[
        idx = len(arr) // 2
        return np.concatenate([arr[idx:], arr[:idx]], axis=0)
    
    
    def fix_date_border(arr, arr_r):
        mask = arr == 0
    
        mapping = {}
        for key, value in zip(arr_r[~mask], arr[~mask]):
            try:
                mapping[key].add(value)
            except KeyError:
                mapping[key] = {value}  
        # if there is only one unique value per key its a one-to-one mapping and no problem
        conflicts = [list(values) for values in mapping.values() if len(values) != 1]
        
        print(mapping)  # DEBUG
        
        for conflict in conflicts:
            idx = np.any([arr == cc for cc in conflict[1:]], axis=0)
            arr[idx] = conflict[0]  # reassign in original (i.e., first) array
            
        return arr
    

    Run component labelling a second time on the rearranged field, then rearrange again:

    arr2 = cut_stitch(arr.swapaxes(0, 1)).swapaxes(0, 1)
    labels2 = cut_stitch(cc3d.connected_components(arr2).swapaxes(0, 1)).swapaxes(0, 1)
    plot(labels)  # original
    plot(labels2)
    

    enter image description here enter image description here

    Fix the labels:

    labels_fixed = fix_date_border(labels.swapaxes(0, 1), labels2.swapaxes(0, 1)).swapaxes(0, 1)
    # {2: {1, 2}, 7: {8, 7}, 5: {4}, 3: {3}, 1: {1}, 4: {5}, 8: {7}, 6: {6}}
    plot(labels_fixed, labels=True)
    

    enter image description here

    Note that the labels will no longer be strictly consecutive in the fixed array!