I have two arrays:
a = numpy.array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
label = numpy.array(['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b'])
I want to insert zeros into a according to the following condition:
If label[i-1] != label[i]:
    insert several zeros (say, 3) into the a array at position i
So, my desired result would be:
a = numpy.array([ 1, 2, 3, 4, 5, 6, 7, 0, 0, 0, 8, 9, 10])
label = numpy.array(['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b'])
As you can see, array a now has 3 zeros after the value 7, inserted at the position where the label value changes.
I have tried the following code:
for i in range(len(a)):
    if label[i-1] != label[i]:
        a = numpy.pad(a, (0,3), 'constant')
    else:
        pass
However, the zeros are padded at the end of the a array. I suspect the problem is that I am assigning the padded result back to the same array while it is changing within the for loop.
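For reference, here is a small sketch of what the loop above appears to be doing. It assumes the same a and label arrays as in the question; the names padded_once and hits are made up for illustration. The pad width (0, 3) asks for 0 values before and 3 values after the whole array, so the loop index never influences where the zeros land.

```python
import numpy

a = numpy.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
label = numpy.array(['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b'])

# (0, 3) means: 0 values before the array, 3 values after it
padded_once = numpy.pad(a, (0, 3), 'constant')
print(padded_once)  # [ 1  2  3  4  5  6  7  8  9 10  0  0  0]

# indices where the loop condition is true;
# at i == 0, label[i-1] is label[-1], i.e. the last element
hits = [i for i in range(len(label)) if label[i - 1] != label[i]]
print(hits)  # [0, 7]
```

Note that the condition is also true at i == 0, because label[-1] wraps around to the last element, so the loop pads twice.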
Here's a NumPy-based approach:
import numpy as np

def pad_at_diff(x, y, n):
    # boolean mask, True at each index where y differs from the previous element
    m = np.r_[False, y[:-1] != y[1:]]
    # output array of zeros, enlarged by n slots per label change
    # (np.zeros defaults to float, hence the float output)
    x_pad = np.zeros(len(x) + n * m.sum())
    # shift each element right by n for every change at or before it
    x_pad[np.arange(len(x)) + np.cumsum(m) * n] = x
    return x_pad
a = np.array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
label = np.array(['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b'])
pad_at_diff(a, label, 3)
array([ 1., 2., 3., 4., 5., 6., 7., 0., 0., 0., 8., 9., 10.])
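To see why the index arithmetic works, it may help to print the intermediate arrays for this first example. This is only an illustrative sketch; the names shift and target do not appear in pad_at_diff and are introduced here for clarity.

```python
import numpy as np

label = np.array(['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b'])
n = 3

# True only at index 7, where the label switches from 'a' to 'b'
m = np.r_[False, label[:-1] != label[1:]]
print(m.astype(int))  # [0 0 0 0 0 0 0 1 0 0]

# every element at or after a change is shifted right by n
shift = np.cumsum(m) * n
print(shift)          # [0 0 0 0 0 0 0 3 3 3]

# destination index of each original element in the padded array
target = np.arange(len(label)) + shift
print(target)         # [ 0  1  2  3  4  5  6 10 11 12]
```

Positions 7, 8 and 9 of the output are never assigned, so they keep the zeros from np.zeros.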
Or for this other example:
a = np.array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,11,12])
label = np.array(['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c'])
pad_at_diff(a, label, 3)
array([ 1., 2., 3., 4., 5., 6., 7., 0., 0., 0., 8., 9., 10.,
0., 0., 0., 11., 12.])
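If the float output is not wanted, one possible alternative is np.insert, which keeps the dtype of the input. This is a different technique from the one above, not a drop-in equivalent in every respect, and the function name pad_at_diff_insert is made up for this sketch.

```python
import numpy as np

def pad_at_diff_insert(x, y, n):
    # indices i where y[i-1] != y[i]
    idx = np.flatnonzero(y[1:] != y[:-1]) + 1
    # repeating each index n times inserts n zeros before that position
    return np.insert(x, np.repeat(idx, n), 0)

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
label = np.array(['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c'])
result = pad_at_diff_insert(a, label, 3)
print(result)
# [ 1  2  3  4  5  6  7  0  0  0  8  9 10  0  0  0 11 12]
```

The indices passed to np.insert refer to positions in the original array, so no cumulative offset is needed here.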