I'm trying to implement the numpy pad function in theano for the constant mode. How is it implemented in numpy? Assume that pad values are just 0.
Given an array
a = np.array([[1,2,3,4],[5,6,7,8]])
# pad values are just 0 as indicated by constant_values=0
np.pad(a, pad_width=[(1,2),(3,4)], mode='constant', constant_values=0)
would return
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 0],
       [0, 0, 0, 5, 6, 7, 8, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
Now, if I know the number of dimensions of a beforehand, I can implement this by creating a new array with the output shape, filled with the pad value, and then assigning the input to the corresponding region of that array. But what if I don't know the number of dimensions of the input array? While I can still infer the shape of the output array from the input array, I have no way of indexing it without knowing how many dimensions it has. Or am I missing something?
That is, if I know that the input has, say, 3 dimensions, then I could do:
zeros_array[pad_width[0][0]:-pad_width[0][1], pad_width[1][0]:-pad_width[1][1], pad_width[2][0]:-pad_width[2][1]] = a
where zeros_array is the new array created with the output shape.
But if I don't know ndim beforehand, I cannot do this.
My instinct is to do:
def ...(arg, pad):
    out_shape = <arg.shape + padding>  # math on tuples/lists
    idx = [slice(x1, x2) for ...]      # again math on shape and padding
    res = np.zeros(out_shape, dtype=arg.dtype)
    res[tuple(idx)] = arg              # index with a tuple of slices, not a list
    return res
In other words, make the target array and copy the input into it with the appropriate indexing tuple. It will require some math and maybe iteration to construct the required shape and slices, but that should be straightforward, if tedious.
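To make the outline above concrete, here is a minimal runnable sketch in plain NumPy. The function name pad_constant and its value parameter are my own, chosen for illustration; pad_width is assumed to be a sequence of (before, after) pairs, one per axis, as in np.pad. Nothing in it depends on knowing ndim ahead of time, because both the output shape and the index are built by looping over arg.shape.

```python
import numpy as np

def pad_constant(arg, pad_width, value=0):
    """Constant-pad `arg` for any number of dimensions (hypothetical helper)."""
    arg = np.asarray(arg)
    # Each axis grows by its (before, after) padding.
    out_shape = tuple(n + before + after
                      for n, (before, after) in zip(arg.shape, pad_width))
    # One slice per axis marking where the original data lands.
    # Using before + n as the stop (instead of -after) also works
    # when after == 0, where slice(before, -0) would select nothing.
    idx = tuple(slice(before, before + n)
                for n, (before, _) in zip(arg.shape, pad_width))
    res = np.full(out_shape, value, dtype=arg.dtype)
    res[idx] = arg
    return res

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
padded = pad_constant(a, [(1, 2), (3, 4)])
print(padded.shape)  # (5, 11)
# Compare against np.pad for the example from the question.
print(np.array_equal(
    padded,
    np.pad(a, [(1, 2), (3, 4)], mode='constant', constant_values=0)))  # True
```

The same allocate-then-assign idea should carry over to Theano with its set_subtensor function, since the tuple of slices is ordinary Python, but I have not tested that here.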
However, it appears that np.pad iterates over the axes (if I've identified the correct branch in its source):
newmat = narray.copy()
for axis, ((pad_before, pad_after), (before_val, after_val)) \
        in enumerate(zip(pad_width, kwargs['constant_values'])):
    newmat = _prepend_const(newmat, pad_before, before_val, axis)
    newmat = _append_const(newmat, pad_after, after_val, axis)
where _prepend_const is:
np.concatenate((np.zeros(padshape, dtype=arr.dtype), arr), axis=axis)
(and _append_const would be similar). So it adds the before and after pieces separately for each dimension. Conceptually that is simple, even if it might not be the fastest.
In [601]: np.lib.arraypad._prepend_const(np.ones((3,5)),3,0,0)
Out[601]:
array([[ 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0.],
       [ 1., 1., 1., 1., 1.],
       [ 1., 1., 1., 1., 1.],
       [ 1., 1., 1., 1., 1.]])
In [604]: arg=np.ones((3,5),int)
In [605]: for i in range(2):
     ...:     arg=np.lib.arraypad._prepend_const(arg,1,0,i)
     ...:     arg=np.lib.arraypad._append_const(arg,2,2,i)
     ...:
In [606]: arg
Out[606]:
array([[0, 0, 0, 0, 0, 0, 2, 2],
       [0, 1, 1, 1, 1, 1, 2, 2],
       [0, 1, 1, 1, 1, 1, 2, 2],
       [0, 1, 1, 1, 1, 1, 2, 2],
       [0, 2, 2, 2, 2, 2, 2, 2],
       [0, 2, 2, 2, 2, 2, 2, 2]])
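The _prepend_const and _append_const helpers are private to np.lib.arraypad and may not be available in your NumPy version, so here is a sketch of the same axis-by-axis idea using only public functions. The names prepend_const and append_const are stand-ins I wrote for illustration, not NumPy's own code; they build the pad block with np.full and attach it with np.concatenate.

```python
import numpy as np

def _const_block(arr, pad_amt, val, axis):
    """Block of `val` shaped like `arr`, but with length `pad_amt` along `axis`."""
    padshape = tuple(pad_amt if i == axis else n
                     for i, n in enumerate(arr.shape))
    return np.full(padshape, val, dtype=arr.dtype)

def prepend_const(arr, pad_amt, val, axis):
    """Hypothetical stand-in: add `pad_amt` slices of `val` before `arr`."""
    if pad_amt == 0:
        return arr
    return np.concatenate((_const_block(arr, pad_amt, val, axis), arr), axis=axis)

def append_const(arr, pad_amt, val, axis):
    """Hypothetical stand-in: add `pad_amt` slices of `val` after `arr`."""
    if pad_amt == 0:
        return arr
    return np.concatenate((arr, _const_block(arr, pad_amt, val, axis)), axis=axis)

# Same loop as the In [605] session above: pad 1 before with 0
# and 2 after with 2, on every axis in turn.
arg = np.ones((3, 5), int)
for i in range(arg.ndim):
    arg = prepend_const(arg, 1, 0, i)
    arg = append_const(arg, 2, 2, i)
print(arg)
```

This should print the same array as Out[606]. Because later axes are padded across the full extent of the earlier padding, the corner regions take the values of the last axis processed. Since the loop runs over range(arg.ndim), it works for any number of dimensions, and each step is a plain concatenation, which Theano also provides.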