
Fill NaN values with mean or interpolated values in a multidimensional array


I have a large multidimensional array that contains many NaN values.

I want to either fill them with the mean value along the first axis (5124) or interpolate them.

import numpy as np

data = np.load('data.npy')

mean = np.nanmean(data, axis=1)

Now mean has shape (5124, 112) and data has shape (5124, 112, 112), so I am trying:

data[np.any(np.isnan(data))][-1, :, :, -1] = mean

but data is still full of NaN values.

I am not sure how to fill the mean values into the data array.

I also tried some interpolation methods, but they are very slow and memory-hungry, so I would like to know whether there is a better way to fill the NaN values.


Solution

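  • A bit slow, but you can try the following. It sets the NaN values to zero and then adds the mean at exactly those positions; mean[:, None] has shape (5124, 1, 112), so it broadcasts against data along axis 1: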

    data2 = np.nan_to_num(data) + np.isnan(data) * mean[:, None]
    

    As suggested by @Stef, you can use bottleneck to speed up the process:

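    # pip install bottleneck
    import bottleneck as bn
    
    mean = bn.nanmean(data, axis=1)
    mask = np.isnan(data)         # record the NaN positions first: bn.replace works in place
    data2 = data.copy()           # keep the original array untouched
    bn.replace(data2, np.nan, 0)  # NaN -> 0, in place
    data2 += mask * mean[:, None]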
    

    Output:

    >>> data2
    array([[[277.02652, 276.253  , 276.36276, ..., 272.2693 , 271.90436,
             271.64706],
            [277.02652, 276.253  , 276.36276, ..., 272.2693 , 271.90436,
             271.64706],
            [277.02652, 276.253  , 276.36276, ..., 272.2693 , 271.90436,
             271.64706],
            ...,
            [277.12585, 275.2982 , 275.56424, ..., 272.2693 , 271.90436,
             271.64706],
            [277.12585, 275.2982 , 275.56424, ..., 272.2693 , 271.90436,
             271.64706],
            [275.11438, 274.27878, 275.17032, ..., 272.2693 , 271.90436,
             271.64706]],
    
           [[277.4939 , 277.1011 , 277.1529 , ..., 271.71024, 271.51944,
             271.41312],
            [277.4939 , 277.1011 , 277.1529 , ..., 271.71024, 271.51944,
             271.41312],
            [277.4939 , 277.1011 , 277.1529 , ..., 271.71024, 271.51944,
             271.41312],
            ...,
            [277.50073, 276.22455, 276.3818 , ..., 271.71024, 271.51944,
             271.41312],
            [277.50073, 276.22455, 276.3818 , ..., 271.71024, 271.51944,
             271.41312],
            [275.67734, 275.02505, 275.5379 , ..., 271.71024, 271.51944,
             271.41312]],
    
           [[280.99646, 280.2319 , 280.23727, ..., 272.60663, 272.44424,
             272.37073],
            [280.99646, 280.2319 , 280.23727, ..., 272.60663, 272.44424,
             272.37073],
            [280.99646, 280.2319 , 280.23727, ..., 272.60663, 272.44424,
             272.37073],
            ...,
            [281.111  , 279.33786, 279.4811 , ..., 272.60663, 272.44424,
             272.37073],
            [281.111  , 279.33786, 279.4811 , ..., 272.60663, 272.44424,
             272.37073],
            [279.05643, 277.9778 , 278.5424 , ..., 272.60663, 272.44424,
             272.37073]],
    
           ...,
    
           [[299.3109 , 298.8816 , 299.19708, ..., 291.41086, 290.98898,
             290.52472],
            [299.3109 , 298.8816 , 299.19708, ..., 291.41086, 290.98898,
             290.52472],
            [299.3109 , 298.8816 , 299.19708, ..., 291.41086, 290.98898,
             290.52472],
            ...,
            [299.31546, 298.22787, 298.71487, ..., 291.41086, 290.98898,
             290.52472],
            [299.31546, 298.22787, 298.71487, ..., 291.41086, 290.98898,
             290.52472],
            [297.89618, 297.6253 , 298.5444 , ..., 291.41086, 290.98898,
             290.52472]],
    
           [[298.63446, 298.0783 , 298.38287, ..., 291.76425, 291.33102,
             290.88724],
            [298.63446, 298.0783 , 298.38287, ..., 291.76425, 291.33102,
             290.88724],
            [298.63446, 298.0783 , 298.38287, ..., 291.76425, 291.33102,
             290.88724],
            ...,
            [298.59354, 297.55087, 298.04398, ..., 291.76425, 291.33102,
             290.88724],
            [298.59354, 297.55087, 298.04398, ..., 291.76425, 291.33102,
             290.88724],
            [297.37015, 297.07132, 297.82864, ..., 291.76425, 291.33102,
             290.88724]],
    
           [[297.76532, 297.54745, 297.9761 , ..., 292.29636, 291.87845,
             291.44046],
            [297.76532, 297.54745, 297.9761 , ..., 292.29636, 291.87845,
             291.44046],
            [297.76532, 297.54745, 297.9761 , ..., 292.29636, 291.87845,
             291.44046],
            ...,
            [297.75772, 296.86356, 297.46335, ..., 292.29636, 291.87845,
             291.44046],
            [297.75772, 296.86356, 297.46335, ..., 292.29636, 291.87845,
             291.44046],
            [296.43677, 296.22397, 297.20667, ..., 292.29636, 291.87845,
             291.44046]]], dtype=float32)
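
For completeness, here is a minimal sketch of two other ways to write the same fill, plus a check of why the assignment in the question has no effect. It runs on a small random stand-in array of shape (6, 4, 4) rather than the real (5124, 112, 112) data, and the names tmp, mask, filled and original are only illustrative. It assumes, as above, that the mean is taken over axis 1 and that no slice along that axis is entirely NaN.

```python
import numpy as np

# Small stand-in for the real array (illustrative values only).
rng = np.random.default_rng(0)
data = rng.normal(280.0, 5.0, size=(6, 4, 4)).astype(np.float32)
data[rng.random(data.shape) < 0.3] = np.nan
data[:, 0, :] = 280.0  # assumption: at least one valid value per slice along axis 1

original = data.copy()
mask = np.isnan(data)

# Mean over axis 1, ignoring NaNs -> shape (6, 4).
mean = np.nanmean(data, axis=1)

# Why the attempt in the question changes nothing: np.any(...) is a single
# boolean, and indexing with it returns a copy (with one extra leading axis),
# so the assignment ends up in a temporary array.
tmp = data[np.any(np.isnan(data))]
tmp[-1, :, :, -1] = mean
assert np.isnan(data).any()  # data itself is unchanged

# Option 1: np.where builds a new array; mean[:, None, :] broadcasts over axis 1.
filled = np.where(mask, mean[:, None, :], data)

# Option 2: fill in place, touching only the NaN positions.
# np.nonzero returns the indices in the same (C) order that data[mask] uses.
t, _, c = np.nonzero(mask)
data[mask] = mean[t, c]
```

Option 2 avoids allocating a second full-size array, which may help if memory is the bottleneck; Option 1 leaves the input untouched. Both should give the same result as the nan_to_num expression above.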