Search code examples
python-2.7numpyscipysmoothingidl-programming-language

Replicate IDL 'smooth' in Python 2.7


I have been trying to work out how to replicate IDL's smooth function in Python and I just can't get anything like the same results. (Disclaimer: It is probably 10 years since I touched this kind of mathematical problem so it has been dumped to make way for information like where to find the cheapest local fuel). I am trying to code this:

smooth(b,w,/nan)

where b is a 2D float array containing NANs (zeros - missing data - have also been converted to NAN).

From the IDL documents, it appears smooth uses a boxcar, so from scipy.ndimage.filters I have tried:

bsmooth = uniform_filter(b, w)

I am aware that there are some fundamental differences here:

  1. the default edge behaviour from IDL is "the end points are copied from the original array to the result with no smoothing" whereas I don't seem to have the option to do this with the uniform filter.
  2. Treatment of the NaN elements. In IDL, the /nan keyword seems to mean that where possible the NaN values will be filled by the result of the other points in the window. If there are no valid points to generate a result, by a MISSING keyword. I thought I could approximate this behaviour following the smoothing using scipy.interpolate's NearestNDInterpolator (thanks to the brilliant explanation by Alex on here: filling gaps on an image using numpy and scipy)

Here is my test array:

 >>>b                                                                                                                array([[ 0.97599638,  0.93114936,  0.87070072,  0.5379253 ],                                                              
       [ 0.34873217,         nan,  0.40985891,  0.22407863],                                                              
       [        nan,         nan,         nan,  0.67532134],                                                              
       [        nan,         nan,  0.85441768,         nan]])  

My answers bore not the SLIGHTEST resemblance to IDL, whether I use the /nan keyword or not.

IDL> smooth(b,2,/nan)
      0.97599638      0.93114936      0.87070072      0.53792530
      0.34873217      0.70728749      0.60817236      0.22407863
             NaN      0.53766960      0.54091913      0.67532134
             NaN             NaN      0.85441768             NaN

IDL> smooth(b,2)
      0.97599638      0.93114936      0.87070072      0.53792530
      0.34873217            -NaN            -NaN      0.22407863
            -NaN            -NaN            -NaN      0.67532134
            -NaN            -NaN      0.85441768             NaN

I confess I find the scipy documentation rather sparse on detail so I have no idea if I am really doing what I think I doing. The fact that the two python approaches which I believed would both smooth the image give different answers suggests that things are not what I understood them to be.

>>>uniform_filter(b, 2)
array([[ 0.97599638,  0.95357287,  0.90092504,  0.70431301],
       [ 0.66236428,         nan,         nan,         nan],
       [        nan,         nan,         nan,         nan],
       [        nan,         nan,         nan,         nan]])    

I thought it was a bit odd it was so empty so I tried this with an array of 100 elements (still using a window of 2) and output the images. The results (first image is 'b' second is 'bsmooth') are not quite what I was hoping for:

![b

![bsmooth

Going back to the smaller array and following the examples in: http://scipy.github.io/old-wiki/pages/Cookbook/SignalSmooth which I thought would give the same output as uniform_filter, I tried:

>>> box = np.array([1,1,1,1])
>>> box = box.reshape(2,2)
>>> box
array([[1, 1],
       [1, 1]])
>>> bsmooth = scipy.signal.convolve2d(b,box,mode='same')
>>> print bsmooth
[[ 0.97599638  1.90714574  1.80185008  1.40862602]
 [ 1.32472855         nan         nan  2.04256356]
 [        nan         nan         nan         nan]
 [        nan         nan         nan         nan]]

Obviously I have completely misunderstood the scipy functions, maybe even the IDL one. If anyone can help me to replicate the IDL smooth function as closely as possible, I would be extremely grateful. I am under considerable time pressure to get a solution for this that doesn't rely on IDL and I am tossing a coin to decide whether to code the function from scratch or develop a very contagious illness.

How can I perform the same smoothing in python?


Solution

  • Since this is not something that is available in the python packages and because I saw the question asked several times during my research without satisfactory answers, here is how I solved the issue.

    Provided is a test version of my function that I'm off to tidy up. I am sure there will be better ways to do the things I have done as I'm still fairly new to Python - please do recommend any appropriate changes.

    Plots use autumn colourmap just because it allowed me to see the NaNs clearly.

    My results:

    IDL propagate
         0.033369284     0.067915268      0.96602046      0.85623550
          0.30435592             NaN             NaN       100.00000
          0.94065958             NaN             NaN      0.90966976
         0.018516513     0.044460904     0.051047217             NaN
    
    python propagate
    [[  3.33692829e-02   6.79152655e-02   9.66020487e-01   8.56235492e-01]
     [  3.04355923e-01              nan              nan   1.00000000e+02]
     [  9.40659566e-01              nan              nan   9.09669768e-01]
     [  1.85165123e-02   4.44609040e-02   5.10472165e-02              nan]]
    
    
    IDL replace
         0.033369284     0.067915268      0.96602046      0.85623550
          0.30435592      0.47452110       14.829881       100.00000
          0.94065958      0.33833817       17.002417      0.90966976
         0.018516513     0.044460904     0.051047217             NaN
    
    python replace
    [[  3.33692829e-02   6.79152655e-02   9.66020487e-01   8.56235492e-01]
     [  3.04355923e-01   4.74521092e-01   1.48298812e+01   1.00000000e+02]
     [  9.40659566e-01   3.38338177e-01   1.70024175e+01   9.09669768e-01]
     [  1.85165123e-02   4.44609040e-02   5.10472165e-02              nan]]
    

    My function:

    #!/usr/bin/env python
    
    # smooth.py  
    __version__ = 0.1
    
    # Version 0.1    29 Feb 2016    ELH    Test release
    
    import numpy as np
    import matplotlib.pyplot as mp
    
    def Smooth(v1, w, nanopt): 
    
        # v1 is the input 2D numpy array.
        # w is the width of the square window along one dimension
        # nanopt can be replace or propagate 
    
        '''
        v1 = np.array(
        [[3.33692829e-02, 6.79152655e-02, 9.66020487e-01, 8.56235492e-01], 
        [3.04355923e-01, np.nan        , 4.86013025e-01, 1.00000000e+02],
        [9.40659566e-01, 5.23314093e-01, np.nan        , 9.09669768e-01],
        [1.85165123e-02, 4.44609040e-02, 5.10472165e-02, np.nan ]])
        w = 2
        '''
    
        mp.imshow(v1, interpolation='None', cmap='autumn')
        mp.show()
    
        # make a copy of the array for the output:
        vout=np.copy(v1)
    
        # If w is even, add one
        if w % 2 == 0:
            w = w + 1
    
        # get the size of each dim of the input:
        r,c = v1.shape
    
        # Assume that w, the width of the window is always square.
        startrc = (w - 1)/2
        stopr = r - ((w + 1)/2) + 1
        stopc = c - ((w + 1)/2) + 1
    
        # For all pixels within the border defined by the box size, calculate the average in the window.
        # There are two options:
        # Ignore NaNs and replace the value where possible.
        # Propagate the NaNs
    
        for col in range(startrc,stopc):
            # Calculate the window start and stop columns
            startwc = col - (w/2) 
            stopwc = col + (w/2) + 1
            for row in range (startrc,stopr):
                # Calculate the window start and stop rows
                startwr = row - (w/2)
                stopwr = row + (w/2) + 1
                # Extract the window
                window = v1[startwr:stopwr, startwc:stopwc]
                if nanopt == 'replace':
                    # If we're replacing Nans, then select only the finite elements
                    window = window[np.isfinite(window)]
                # Calculate the mean of the window
                vout[row,col] = np.mean(window)
    
        mp.imshow(vout, interpolation='None', cmap='autumn')
        mp.show()
    
        return vout