I am looking to make a NumPy array randomly initialized with zeros and ones. I found the following question here, which describes the basic random array and how to control its dimensions. However, I need my array to have at least a single '1' in each sub-array along the last (nested) axis. See example:
import numpy as np
size = (3, 5, 5)
proba_0 = 0.7
n_positions = np.random.choice([0,1], size=size, p=[proba_0, 1-proba_0])
print(n_positions)
[[[0 1 1 0 0]
  [0 0 0 0 0]
  [0 0 0 1 1]
  [1 0 0 0 0]
  [0 1 0 0 0]]

 [[0 1 1 1 1]
  [0 0 1 1 0]
  [1 0 0 1 1]
  [0 0 1 0 0]
  [0 1 0 1 1]]

 [[0 0 0 0 1]
  [0 0 1 0 1]
  [0 0 0 0 1]
  [0 0 0 1 0]
  [0 0 0 0 0]]]
The issue here is that at position n_positions[0][1] the data is populated ONLY with zeros. I need at least one '1' in each row along axis 2. I can increase the probability of 1s occurring, but this doesn't eliminate the risk.
I could do this with a loop or a comprehension, getting NumPy to generate a random number of ones between 1 and 5 for each row and then filling the rest with zeros, but it is very slow. I am hoping there is a more NumPy-friendly, built-in method to achieve this?
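For reference, a minimal sketch of the loop-based approach described above might look like the following. The name `slow_fill` and the use of `np.random.default_rng` are illustrative assumptions, not code from the original question. Note that it does not reproduce the `proba_0` distribution; it only guarantees at least one 1 per row.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily, for reproducibility only

def slow_fill(size):
    """Build a 0/1 array row by row so that every last-axis row has 1..n ones."""
    out = np.zeros(size, dtype=int)
    n = size[-1]
    # np.ndindex iterates over every coordinate of the leading axes
    for coords in np.ndindex(*size[:-1]):
        k = rng.integers(1, n + 1)                       # number of ones: 1..n inclusive
        ones_at = rng.choice(n, size=k, replace=False)   # distinct positions in the row
        out[coords + (ones_at,)] = 1
    return out

arr = slow_fill((3, 5, 5))
print(arr.any(axis=-1).all())  # True: every row has at least one 1
```

The Python-level loop runs once per row, which is why this gets slow for large arrays.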
One solution, in case you only want to fill the last axis (if you want to do this for all axes, then I guess you need to repeat it for each axis):
def fillLastAxis(arr):
    # Modifies arr in place.
    # Position, in all but the last axis, of the rows that need a 1
    pos = ~arr.any(axis=-1)  # True at row coords that are missing a 1
    # Number of 1s to generate
    num = pos.sum()
    # Index (along the last axis) of each missing 1
    idx = np.random.randint(0, arr.shape[-1], num)
    # Just add a 1 at those positions
    arr[pos, idx] = 1
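If a different axis needs the same treatment, one possible generalization is to move that axis to the end with `np.moveaxis` and apply the same trick. This is only a sketch: `fill_axis` and its `rng` parameter are hypothetical names introduced here, and it relies on `np.moveaxis` returning a view so that the assignment writes into the original array.

```python
import numpy as np

def fill_axis(arr, axis=-1, rng=None):
    """In place: put one random 1 into every all-zero 1-D slice along `axis`."""
    rng = np.random.default_rng() if rng is None else rng
    view = np.moveaxis(arr, axis, -1)   # a view, so writes land in `arr`
    pos = ~view.any(axis=-1)            # True where a slice has no 1
    idx = rng.integers(0, view.shape[-1], size=pos.sum())
    view[pos, idx] = 1

a = np.zeros((2, 3, 4), dtype=int)
fill_axis(a, axis=1, rng=np.random.default_rng(0))
print(a.any(axis=1).all())  # True
```

Calling it once per axis would cover the "all axes" case mentioned above.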
Testing:
size = (3, 5, 5)
proba_0 = 0.7
n_positions = np.random.choice([0,1], size=size, p=[proba_0, 1-proba_0])
print("==== Before ====")
print(n_positions)
fillLastAxis(n_positions)
print("==== After ====")
print(n_positions)
Shows
==== Before ====
[[[1 0 1 1 1]
  [1 0 0 0 1]
  [0 1 0 0 0]
  [0 0 0 0 0]
  [1 0 1 0 1]]

 [[0 0 1 0 1]
  [0 0 0 0 0]
  [0 0 0 0 1]
  [0 1 1 0 1]
  [0 0 0 0 0]]

 [[0 0 0 0 1]
  [0 0 0 1 1]
  [1 1 1 0 1]
  [0 1 0 0 0]
  [0 1 0 0 1]]]
==== After ====
[[[1 0 1 1 1]
  [1 0 0 0 1]
  [0 1 0 0 0]
  [0 1 0 0 0]
  [1 0 1 0 1]]

 [[0 0 1 0 1]
  [0 0 1 0 0]
  [0 0 0 0 1]
  [0 1 1 0 1]
  [0 0 0 1 0]]

 [[0 0 0 0 1]
  [0 0 0 1 1]
  [1 1 1 0 1]
  [0 1 0 0 0]
  [0 1 0 0 1]]]
You can see that row 3 of plane 0, and rows 1 and 4 of plane 1, are missing a 1 before, and each has one (at positions 1, 2 and 3 respectively) after. All other rows are unchanged.
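Rather than comparing the two printouts by eye, the same check can be done programmatically. The helper below is a sketch (the name `check_fill` is made up for illustration). It verifies that every row ends up with a 1 and that exactly one cell changed in each previously empty row and none elsewhere. The small arrays are hand-built stand-ins, not output of the code above.

```python
import numpy as np

def check_fill(before, after):
    """True if `after` differs from `before` only by one added 1 per all-zero row."""
    was_empty = ~before.any(axis=-1)          # rows with no 1 before the fill
    changed = (before != after).sum(axis=-1)  # number of cells changed per row
    return bool(
        after.any(axis=-1).all()              # every row now has at least one 1
        and np.array_equal(changed, was_empty.astype(int))
    )

before = np.array([[[1, 0, 0], [0, 0, 0]],
                   [[0, 0, 0], [0, 1, 1]]])
after = before.copy()
after[0, 1, 2] = 1   # hand-filled, standing in for what fillLastAxis would do
after[1, 0, 0] = 1
print(check_fill(before, after))   # True
print(check_fill(before, before))  # False: two rows are still all zeros
```

To use it with the test above, keep a copy (`n_positions.copy()`) before calling `fillLastAxis`, since the function modifies its argument in place.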
With a higher dimension (to test with more axes, and to reduce the probability that this was just luck). This time I fix the seed, so that you can reproduce the test at home and check the result, without the need for me to print the whole array, since you will get the same one.
size = (3, 6, 4, 5)
proba_0 = 0.5
np.random.seed(12)
n_positions = np.random.choice([0,1], size=size, p=[proba_0, 1-proba_0])
print("==== Before ====")
print(n_positions)
fillLastAxis(n_positions)
print("==== After ====")
print(n_positions)
If I am correct, there are 3 all-0 lines in the result ([0,2,1,:], [2,0,3,:] and [2,4,3,:]), and they are all filled with one random 1 after.
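To list the all-zero rows without reading the printed array, something like the following sketch could be used. `empty_rows` is a hypothetical helper name, and the `demo` array is a small made-up example rather than the seeded array above.

```python
import numpy as np

def empty_rows(arr):
    """Coordinates (in all but the last axis) of rows that contain no 1."""
    return np.argwhere(~arr.any(axis=-1))

demo = np.array([[[0, 0], [1, 0]],
                 [[0, 1], [0, 0]]])
print(empty_rows(demo))
# [[0 0]
#  [1 1]]
```

Calling `empty_rows(n_positions)` before `fillLastAxis` should list the coordinates of the rows to be filled, and calling it afterwards should return an empty array.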