python numpy random vectorization sample

Efficiently draw random samples without replacement from an array in python

I need to draw random samples without replacement from a 1D NumPy array. However, performance is critical since this operation will be repeated many times.

Here’s the code I’m currently using:

import numpy as np

# Example array
array = np.array([10, 20, 30, 40, 50])

# Number of samples to draw
num_samples = 3

# Draw samples without replacement
samples = np.random.choice(array, size=num_samples, replace=False)

print("Samples:", samples)

While this works for one sample, it requires a loop to generate multiple samples, and I believe there could be a way to optimize or vectorize this operation to improve performance when sampling multiple times.

Is there a way to vectorize or otherwise optimize this operation?
Would another library (e.g., TensorFlow, PyTorch) provide better performance for this task?
Are there specific techniques for bulk sampling that avoid looping in Python?

Solution

The code below generates random samples of a list without replacement in a vectorized manner. This solution is particularly useful when the number of simulations is large and the number of samples per simulation is low.

import numpy as np

def draw_random_samples(len_deck, n_simulations, n_cards):
    
    """
    Draw random samples from the deck.
    
    Parameters
    ----------
    len_deck : int
        Length of the deck.
    n_simulations : int
        How many combinations of cards are generated. (Doubles could occur.)
    n_cards : int
        How many cards to draw from the deck per simulation. (All cards are unique.)
    
    Returns
    -------
    indices : array-like
        Random indices of the deck. 
    
    """
    
    indices = np.random.randint(0, len_deck, (1, n_simulations))

    for i in range(1, n_cards):
        new_indices = np.random.randint(0, len_deck-i, n_simulations)
        new_indices += np.sum(new_indices >= indices - np.arange(i)[:,None], axis=0)

        indices = np.vstack((indices, new_indices))
        indices = np.sort(indices, axis=0)
    
    return indices.T