Tags: arrays, python-3.x, numpy, repeat

Creating repeating values using numpy.repeat


I am a Python newbie trying to replicate an R script of the following form in Python.

# set value for k
k <- 3

# script 1 R
cnt <- c(1,-1, rep(0,k-2))

print(cnt)
1 -1  0

# script 2 R
for (i in 2:(k-1)) {
  cnt <- c(cnt, c(rep(0,(i-1)),1,-1,rep(0,(k-i-1))))
    }

print(cnt)
1 -1  0  0  1 -1

In script 2, each new piece is appended to the outcome of script 1 via concatenation. Given that Python doesn't have a direct equivalent to R's rep() function, this is what I have attempted to do, using numpy.repeat.

For R script 1, I did the following, which got me close to the desired outcome 1 -1 0,

# code 1 Python
pcnt = np.array([1, -1, np.repeat('0', k-2)], dtype=int)

print(pcnt)
[ 1 -1  0]

but with a DeprecationWarning: setting an array element with a sequence. This was supported in some cases where the elements are arrays with a single element. For example `np.array([1, np.array([2])], dtype=int)`. In the future this will raise the same ValueError as `np.array([1, [2]], dtype=int)`.

For R script 2 I tried the following, but excluded the concatenation part with the intention of concatenating them after.

# code 2 Python
for i in range(2, k-1+1):
    pcnt2 = np.array([np.repeat(0, (i-1)), 1, -1, np.repeat(0, (k-i-1))], dtype='int')
print(pcnt2)

The above code raises a ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (4,) + inhomogeneous part.

However, when I change the dtype to 'object', like so:

for i in range(2, k-1+1):
    pcnt3 = np.array([np.repeat(0, (i-1)), 1, -1, np.repeat(0, (k-i-1))], dtype='object')
print(pcnt3)

I get:

array([array([0]), 1, -1, array([], dtype=int32)], dtype=object)

What I need help with is how to get two separate flat arrays from the two Python snippets: code 1 Python should output 1 -1 0, and code 2 Python should output 0 1 -1.


Solution

  • Your repeat makes an array.

    If we try to make a new array with integers and an array, we get:

    In [23]: np.array([1,2,np.array([3,4])])
    <ipython-input-23-77cecd77c763>:1: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
      np.array([1,2,np.array([3,4])])
    Out[23]: array([1, 2, array([3, 4])], dtype=object)
    

    Note that the result is a 3-element object-dtype array, built from what the warning calls a 'ragged' sequence.

    If we specify the int dtype, I get (in the latest version) the error that your warning was talking about:

    In [24]: np.array([1,2,np.array([3,4])],dtype=int)
    Traceback (most recent call last):
      Input In [24] in <cell line: 1>
        np.array([1,2,np.array([3,4])],dtype=int)
    ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (3,) + inhomogeneous part.
    

    I suspect what you want is to concatenate, that is, to join the integers and the array into one flat array. hstack makes that easy:

    In [25]: np.hstack([1,2,np.array([3,4])])
    Out[25]: array([1, 2, 3, 4])
    

    or

    In [26]: np.concatenate([[1,2],np.array([3,4])])
    Out[26]: array([1, 2, 3, 4])
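
Applied to the two snippets in the question, this could look like the sketch below. It only swaps np.array for np.hstack and assumes k = 3 as in the question; the names pcnt and pcnt2 are taken from the question.

```python
import numpy as np

k = 3  # same value as in the question

# R: c(1, -1, rep(0, k-2))
# hstack turns the scalars into 1-element arrays and joins everything flat
pcnt = np.hstack([1, -1, np.repeat(0, k - 2)])
print(pcnt)   # [ 1 -1  0]

# R, body of the loop: c(rep(0, i-1), 1, -1, rep(0, k-i-1))
for i in range(2, k):
    pcnt2 = np.hstack([np.repeat(0, i - 1), 1, -1, np.repeat(0, k - i - 1)])
print(pcnt2)  # [ 0  1 -1]
```

Note that np.repeat(0, 0) returns an empty integer array, which hstack simply skips over, so no special case is needed for the last iteration.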
    

    Your R code uses c, which is a concatenate function: its docs describe the arguments as the objects to be concatenated.

    np.array is not a concatenate. For lists of equal-length lists it does join them, but along a new axis, as in the classic example (np.stack behaves the same way here):

    In [27]: np.array([ [1,2], [3,4]])
    Out[27]: 
    array([[1, 2],
           [3, 4]])
    In [28]: np.stack([ [1,2], [3,4]])
    Out[28]: 
    array([[1, 2],
           [3, 4]])
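
To tie this back to the question, here is a minimal sketch of the whole R script (script 1 followed by the loop in script 2), written for k = 3 as in the question. The helper name contrast_rows is hypothetical, not something from the question or from NumPy. It builds one length-k piece per step, then shows the difference between joining the pieces flat with np.concatenate (like R's c) and joining them on a new axis with np.stack.

```python
import numpy as np

def contrast_rows(k):
    """Hypothetical helper: pieces of length k holding 1, -1 at positions i, i+1."""
    rows = []
    for i in range(1, k):  # i = 1 is script 1, i = 2..k-1 is the loop in script 2
        rows.append(np.concatenate([np.repeat(0, i - 1),
                                    [1, -1],
                                    np.repeat(0, k - i - 1)]))
    return rows

rows = contrast_rows(3)

flat = np.concatenate(rows)  # joins along the existing axis, like R's c()
grid = np.stack(rows)        # joins along a new leading axis

print(flat)        # [ 1 -1  0  0  1 -1]
print(grid.shape)  # (2, 3)
```

The flat result matches the output of the R script in the question, and grid.ravel() gives the same values, so the stacked form is just the same data with one extra axis.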