python numpy random

Create a NumPy array with zeros and ones at random positions, but with at least one '1' on a given sub-axis


I am looking to make a NumPy array randomly initialized with zeros and ones. I found a question here that describes the basic random array and how to control its dimensions. However, I need my array to have at least one '1' in each sub-array along the innermost axis. See the example:

import numpy as np    
size = (3, 5, 5)
proba_0 = 0.7
n_positions = np.random.choice([0,1], size=size, p=[proba_0, 1-proba_0])
print(n_positions)

[[[0 1 1 0 0]
  [0 0 0 0 0]
  [0 0 0 1 1]
  [1 0 0 0 0]
  [0 1 0 0 0]]

 [[0 1 1 1 1]
  [0 0 1 1 0]
  [1 0 0 1 1]
  [0 0 1 0 0]
  [0 1 0 1 1]]

 [[0 0 0 0 1]
  [0 0 1 0 1]
  [0 0 0 0 1]
  [0 0 0 1 0]
  [0 0 0 0 0]]]

The issue here is that at position n_positions[0][1] the row is populated ONLY with zeros. I need at least one '1' in each row along axis 2. I can increase the probability of 1s occurring, but this doesn't eliminate the risk.
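To put a rough number on that risk, here is a back-of-the-envelope sketch. It assumes every entry is drawn independently with the probabilities passed to np.random.choice; the names row_len and n_rows are illustrative only and do not come from the code above.

```python
proba_0 = 0.7      # probability of drawing a 0, as in the example above
row_len = 5        # length of the last axis
n_rows = 3 * 5     # number of rows in a (3, 5, 5) array

# Probability that a single row of independent draws is all zeros
p_row_empty = proba_0 ** row_len
# Probability that at least one of the rows is all zeros
p_any_empty = 1 - (1 - p_row_empty) ** n_rows

print(f"P(a given row is all zeros)     = {p_row_empty:.4f}")
print(f"P(at least one all-zero row)    = {p_any_empty:.4f}")
```

Under those assumptions an all-zero row is the expected case rather than a rare one, so tuning proba_0 alone cannot be relied on.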

I could build this with a loop or a comprehension, getting NumPy to generate a random number of ones between 1 and 5 for each row and then padding with zeros, but it's very slow. I am hoping there is a more NumPy-friendly, built-in way to achieve this?


Solution

  • One solution, in case you only want to fill the last axis (if you want to do this for all axes, then I guess you need to repeat it for each axis):

    import numpy as np

    def fillLastAxis(arr):
        """Put a 1 at a random position in every all-zero row of the last axis (in place)."""
        # Boolean mask over all axes but the last: True where a row has no 1
        pos = ~arr.any(axis=-1)
        # Number of 1s to generate (one per all-zero row)
        num = pos.sum()
        # Random index along the last axis for each missing 1
        idx = np.random.randint(0, arr.shape[-1], num)
        # Set a single 1 at that index in each all-zero row
        arr[pos, idx] = 1
    

    Testing:

    size = (3, 5, 5)
    proba_0 = 0.7
    n_positions = np.random.choice([0,1], size=size, p=[proba_0, 1-proba_0])
    print("==== Before ====")
    print(n_positions)
    fillLastAxis(n_positions)
    print("==== After ====")
    print(n_positions)
    

    Shows

    ==== Before ====
    [[[1 0 1 1 1]
      [1 0 0 0 1]
      [0 1 0 0 0]
      [0 0 0 0 0]
      [1 0 1 0 1]]
    
     [[0 0 1 0 1]
      [0 0 0 0 0]
      [0 0 0 0 1]
      [0 1 1 0 1]
      [0 0 0 0 0]]
    
     [[0 0 0 0 1]
      [0 0 0 1 1]
      [1 1 1 0 1]
      [0 1 0 0 0]
      [0 1 0 0 1]]]
    ==== After ====
    [[[1 0 1 1 1]
      [1 0 0 0 1]
      [0 1 0 0 0]
      [0 1 0 0 0]
      [1 0 1 0 1]]
    
     [[0 0 1 0 1]
      [0 0 1 0 0]
      [0 0 0 0 1]
      [0 1 1 0 1]
      [0 0 0 1 0]]
    
     [[0 0 0 0 1]
      [0 0 0 1 1]
      [1 1 1 0 1]
      [0 1 0 0 0]
      [0 1 0 0 1]]]
    

    You can see that row 3 of plane 0, and rows 1 and 4 of plane 1, are missing a 1 before, and that each has one after (at positions 1, 2 and 3 respectively).
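Rather than scanning the printout by eye, the all-zero rows can also be located programmatically. The following is a small sketch that copies the "Before" array printed above by hand; the names before and empty_rows are illustrative only.

```python
import numpy as np

# The "Before" array from the printout above, copied by hand
before = np.array([
    [[1, 0, 1, 1, 1],
     [1, 0, 0, 0, 1],
     [0, 1, 0, 0, 0],
     [0, 0, 0, 0, 0],
     [1, 0, 1, 0, 1]],
    [[0, 0, 1, 0, 1],
     [0, 0, 0, 0, 0],
     [0, 0, 0, 0, 1],
     [0, 1, 1, 0, 1],
     [0, 0, 0, 0, 0]],
    [[0, 0, 0, 0, 1],
     [0, 0, 0, 1, 1],
     [1, 1, 1, 0, 1],
     [0, 1, 0, 0, 0],
     [0, 1, 0, 0, 1]],
])

# (plane, row) coordinates of every row that contains no 1
empty_rows = np.argwhere(~before.any(axis=-1))
print(empty_rows.tolist())  # [[0, 3], [1, 1], [1, 4]]
```

The same mask, ~arr.any(axis=-1), is what the function uses internally, so running this check on the array after the call should give an empty list.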

    With a higher dimension (to test with more axes, and to reduce the probability that this was just luck). This time I fix the seed, so that you can run it at home and check the result without me needing to print the whole array, since you will get the same one.

    size = (3, 6, 4, 5)
    proba_0 = 0.5
    np.random.seed(12)
    n_positions = np.random.choice([0,1], size=size, p=[proba_0, 1-proba_0])
    print("==== Before ====")
    print(n_positions)
    fillLastAxis(n_positions)
    print("==== After ====")
    print(n_positions)
    

    If I am correct, there are 3 all-0 rows in the array before filling ([0,2,1,:], [2,0,3,:] and [2,4,3,:]), and each of them gets one random 1 afterwards.
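As for the remark at the top about repeating this for the other axes, one possible sketch is below. It is not part of the function above: fill_axis is a hypothetical name, it assumes an array with at least two dimensions, and it uses NumPy's newer Generator API (np.random.default_rng) instead of np.random.seed, so its random stream differs from the seeded example.

```python
import numpy as np

def fill_axis(arr, axis=-1, rng=None):
    """Set one random entry to 1 in every all-zero 1-D slice along `axis` (in place)."""
    rng = np.random.default_rng() if rng is None else rng
    # np.moveaxis returns a view, so writing to it also writes to `arr`
    view = np.moveaxis(arr, axis, -1)
    # True for every slice along the chosen axis that contains no 1
    pos = ~view.any(axis=-1)
    # One random index along the chosen axis per all-zero slice
    idx = rng.integers(0, view.shape[-1], pos.sum())
    view[pos, idx] = 1

rng = np.random.default_rng(0)
a = rng.choice([0, 1], size=(3, 6, 4, 5), p=[0.5, 0.5])

# Repeat the fill for each axis in turn
for axis in range(a.ndim):
    fill_axis(a, axis, rng)

# Every 1-D slice along every axis now holds at least one 1
print(all(a.any(axis=axis).all() for axis in range(a.ndim)))
```

Because each pass only ever turns zeros into ones, a guarantee established for one axis cannot be undone by a later pass on another axis.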