Select multiple columns from array, multiple times


I have the following function, taken from SciPy:

def _bootstrap_resample(sample, n_resamples=None, random_state=None):
    """Bootstrap resample the sample."""
    n = sample.shape[-1]

    # bootstrap - each row is a random resample of original observations
    i = rng_integers(random_state, 0, n, (n_resamples, n))

    resamples = sample[..., i]
    return resamples

In my case:

sample:

[[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]]

i:

[[0 0 0 0 0 1 1 1 1 1]
 [2 2 2 2 2 3 3 3 3 3]]

What I want:

[[[ 0  0  0  0  0  1  1  1  1  1]
  [10 10 10 10 10 11 11 11 11 11]]

 [[ 2  2  2  2  2  3  3  3  3  3]
  [12 12 12 12 12 13 13 13 13 13]]]

That is, each row of i should say which columns of sample to take, and each row should produce one new resample. The provided code:

resamples = sample[..., i]

unfortunately does not do that and instead produces:

[[[ 0  0  0  0  0  1  1  1  1  1]
  [ 2  2  2  2  2  3  3  3  3  3]]

 [[10 10 10 10 10 11 11 11 11 11]
  [12 12 12 12 12 13 13 13 13 13]]]
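For reference, the behavior above can be reproduced with the following minimal sketch. It assumes NumPy is imported as np and hard-codes i to the values shown above instead of drawing it at random; it is only meant to show the axis order of the result.

```python
import numpy as np

# The 2 x 10 sample and the index array from the example above
sample = np.arange(20).reshape(2, 10)
i = np.array([[0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
              [2, 2, 2, 2, 2, 3, 3, 3, 3, 3]])

resamples = sample[..., i]

# Axes are (rows of sample, rows of i, columns of i)
print(resamples.shape)

# The first axis runs over the rows of sample, not over the rows of i
print(resamples[0])
```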

How can I obtain what I want here?


Solution

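  • The exact expected logic is not fully clear, but np.moveaxis may be what you want. sample[..., i] has shape sample.shape[:-1] + i.shape, so the axis that runs over the rows of i ends up second to last; np.moveaxis moves it to the front: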

    resamples = np.moveaxis(sample[..., i], -2, 0)
    

    Output:

    array([[[ 0,  0,  0,  0,  0,  1,  1,  1,  1,  1],
            [10, 10, 10, 10, 10, 11, 11, 11, 11, 11]],
    
           [[ 2,  2,  2,  2,  2,  3,  3,  3,  3,  3],
            [12, 12, 12, 12, 12, 13, 13, 13, 13, 13]]])
    

    Generalization

    If sample has shape (a, b, c, ..., z) and i has shape (n, z), this gives an output of shape (n, a, b, c, ..., z).

    Example:

    import numpy as np
    
    sample = np.arange(3*4*5*10).reshape((3, 4, 5, -1))+100
    # shape: (3, 4, 5, 10)
    
    n = sample.shape[-1]
    # 10
    
    n_resamples = 20
    # bootstrap - each row is a random resample of original observations
    rng = np.random.default_rng()
    i = rng.integers(0, n, (n_resamples, n))
    # shape: (20, 10)
    
    resamples = np.moveaxis(sample[..., i], -2, 0)
    # shape: (20, 3, 4, 5, 10)
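
One way to check that this does what is intended: resample k should equal sample[..., i[k]], i.e. the columns listed in row k of i taken along the last axis. The sketch below is only a sanity check under that assumption; the seed 0 and the names indexed and all_match are introduced here purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only to make the check repeatable

sample = np.arange(3 * 4 * 5 * 10).reshape(3, 4, 5, 10) + 100
n = sample.shape[-1]
n_resamples = 20
i = rng.integers(0, n, (n_resamples, n))

indexed = sample[..., i]                 # shape (3, 4, 5, 20, 10)
resamples = np.moveaxis(indexed, -2, 0)  # shape (20, 3, 4, 5, 10)

# Resample k should pick the columns i[k] along the last axis of sample
all_match = all(
    np.array_equal(resamples[k], sample[..., i[k]])
    for k in range(n_resamples)
)
print(indexed.shape, resamples.shape, all_match)
```

Note that np.moveaxis returns a view of the indexed array, so reordering the axes does not copy the data a second time.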