python, numpy, zero-padding

Condition-based array zero padding


I have two arrays:

 a = numpy.array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
 label = numpy.array(['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b'])

I want to insert zeros into a according to the following condition:

If label[i-1] != label[i]:
   insert several zeros (say, 3) into the 'a' array at position 'i'

So, my desired result would be:

a = numpy.array([ 1,  2,  3,  4,  5,  6,  7, 0, 0, 0, 8,  9, 10])
label = numpy.array(['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b'])

As you can see, array a now has 3 zeros after the value 7, inserted at the position where the label value changes.

I have tried the following code:

for i in range(len(a)):
    if label[i-1] != label[i]:
        a = numpy.pad(a, (0,3), 'constant')
    else:
        pass

However, the zeros are padded at the end of the a array rather than at position i. I suspect the problem is related to assigning the padded result back to the same array while it is changing inside the for loop.
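
For reference, a minimal sketch of why the loop above behaves this way. With a pad width of (0, 3), numpy.pad adds no elements before the array and three after it, so the loop index i never influences where the zeros land. In addition, at i = 0 the expression label[i-1] is label[-1], the last element, so the comparison wraps around and fires once more than intended for this data. The names padded_once and hits are illustrative only.

```python
import numpy

a = numpy.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
label = numpy.array(['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b'])

# pad width (0, 3) means: 0 elements before the array, 3 after it,
# so the zeros always go to the end regardless of the loop index
padded_once = numpy.pad(a, (0, 3), 'constant')
print(padded_once)

# indices where the loop condition is true; i = 0 compares label[-1]
# (the last element) with label[0], which is a wraparound match here
hits = [i for i in range(len(a)) if label[i - 1] != label[i]]
print(hits)  # [0, 7]
```

Starting the loop at 1 avoids the wraparound, but the zeros would still be appended at the end as long as numpy.pad is used this way.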


Solution

  • Here's a NumPy-based approach:

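    import numpy as np

    def pad_at_diff(x, y, n):
        # boolean mask, True at each index where y differs from the previous element
        m = np.r_[False, y[:-1] != y[1:]]
        # output array of zeros, long enough for x plus n zeros per change
        # (np.zeros defaults to float64, so the result is a float array)
        x_pad = np.zeros(len(x) + n * m.sum())
        # shift each element of x right by n for every change at or before it
        x_pad[np.arange(len(x)) + np.cumsum(m) * n] = x
        return x_pad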
    

    a = np.array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
    label = np.array(['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b'])
    pad_at_diff(a, label, 3)
    array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  0.,  0.,  0.,  8.,  9., 10.])
    

    Or for this other example:

    a = np.array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10,11,12])
    label = np.array(['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c'])
    pad_at_diff(a, label, 3)
    array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  0.,  0.,  0.,  8.,  9., 10.,
            0.,  0.,  0., 11., 12.])
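
The outputs above are float arrays because np.zeros defaults to float64, whereas the desired result in the question is an integer array. As a possible alternative, here is a minimal sketch built on np.insert, which keeps the dtype of the input. The function name pad_at_diff_insert and the variable change_idx are hypothetical and are not part of the original answer.

```python
import numpy as np

def pad_at_diff_insert(x, y, n):
    # indices i >= 1 where y[i-1] != y[i]
    change_idx = np.flatnonzero(y[1:] != y[:-1]) + 1
    # np.insert places values before each given index; repeating every
    # index n times inserts n zeros at that position
    return np.insert(x, np.repeat(change_idx, n), 0)

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
label = np.array(['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b'])
result = pad_at_diff_insert(a, label, 3)
print(result)
```

For the arrays from the question this should give the integers 1 to 7, then three zeros, then 8, 9, 10, with the integer dtype of a preserved.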