I have a large sparse matrix in which each row contains multiple nonzero elements, for example
a = np.array([[1, 1,0,0,0,0], [2,0, 1,0,2,0], [3,0,4,0,0, 3]])
I want to randomly select one nonzero element per row without a for loop. Any good suggestions? As output, I am more interested in the chosen elements' indexes than in their values.
With a numpy array such as:
arr = np.array([5, 2, 6, 0, 2, 0, 0, 6])
you can do arr != 0, which gives a True/False array marking the values that pass the condition, in our case the values that are not equal (!=) to 0. So:
array([ True, True, True, False, True, False, False, True], dtype=bool)
From here, we can index arr with this boolean array by doing arr[arr != 0], which gives us:
array([5, 2, 6, 2, 6])
So now that we have a way of removing the zero values from a numpy array, we can do a simple list comprehension over the rows of your a array. For each row, we remove the zeros and then call np.random.choice on what is left. Like so:
np.array([np.random.choice(r[r!=0]) for r in a])
which gives you back an array of length 3 containing one random non-zero item from each row in a. :)
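One thing the one-liner above does not cover is a row with no non-zero entries, where np.random.choice raises a ValueError because the filtered row is empty. Here is a minimal sketch of a guarded version. The function name random_nonzero_values, the seed parameter and the choice to return 0 for an all-zero row are my own assumptions, not something from the question:

```python
import numpy as np

def random_nonzero_values(a, seed=None):
    """Pick one random non-zero value per row (hypothetical helper).

    Rows without any non-zero entry yield 0; that fallback is an assumption.
    """
    rng = np.random.default_rng(seed)
    picks = []
    for row in a:
        nonzero = row[row != 0]
        # rng.choice raises ValueError on an empty array, so guard that case
        picks.append(rng.choice(nonzero) if nonzero.size else 0)
    return np.array(picks)

a = np.array([[1, 1, 0, 0, 0, 0], [2, 0, 1, 0, 2, 0], [3, 0, 4, 0, 0, 3]])
values = random_nonzero_values(a, seed=0)
```

Passing a seed only makes the picks repeatable between runs; leave it as None for fresh randomness each time.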
Hope this helps!
Update
If you want the indexes of the random non-zero numbers in the array, you can use .nonzero().
So if we have this array:
arr = np.array([5, 2, 6, 0, 2, 0, 0, 6])
we can do:
arr.nonzero()
which gives a tuple of the indexes of the non-zero elements:
(array([0, 1, 2, 4, 7]),)
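The tuple is there because .nonzero() returns one index array per dimension. As a small sketch of two ways to get at the plain index array for a 1-D input (the variable names idx and flat_idx are just for illustration):

```python
import numpy as np

arr = np.array([5, 2, 6, 0, 2, 0, 0, 6])

# .nonzero() returns one index array per dimension, wrapped in a tuple,
# so a 1-D array gives a 1-tuple that can be unpacked directly
(idx,) = arr.nonzero()

# np.flatnonzero returns the same indexes as a plain array
flat_idx = np.flatnonzero(arr)
```

Either form can be passed straight to np.random.choice.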
So, as before, we can use this together with np.random.choice() in a list comprehension to produce random indexes:
a = np.array([[1, 1, 0, 0, 0, 0], [2, 0, 1, 0, 2, 0], [3, 0, 4, 0, 0, 3]])
np.array([np.random.choice(r.nonzero()[0]) for r in a])
which returns an array of the form [x, y, z], where x, y and z are random indexes of non-zero elements from their corresponding rows.
E.g. one result could be:
array([1, 4, 2])
And if you want it to also return the rows, you could just add in a np.arange() call on the length of a to get an array of row numbers:
([np.arange(len(a))], np.array([np.random.choice(r.nonzero()[0]) for r in a]))
so an example random output could be:
([array([0, 1, 2])], array([1, 2, 5]))
for a defined as:
array([[1, 1, 0, 0, 0, 0],
[2, 0, 1, 0, 2, 0],
[3, 0, 4, 0, 0, 3]])
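If you then want the values that sit at those positions, the row and column arrays can be used together to index a. This is only a sketch: it uses a plain np.arange for the rows (not wrapped in a list as in the tuple above), which gives a flat result of one value per row, and the names rows, cols and values are mine:

```python
import numpy as np

a = np.array([[1, 1, 0, 0, 0, 0], [2, 0, 1, 0, 2, 0], [3, 0, 4, 0, 0, 3]])

rows = np.arange(len(a))
cols = np.array([np.random.choice(r.nonzero()[0]) for r in a])

# Pairing rows[i] with cols[i] picks exactly one element from each row
values = a[rows, cols]
```

Since every column index came from .nonzero(), each entry of values is non-zero.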
Hope this does what you want now :)
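Since the question asked for no for loop, and a list comprehension is still a Python-level loop, here is a sketch of a fully vectorized alternative. It is a different technique from the one above: give every element a random score, push the scores of the zeros below all others, and take the argmax of each row. It assumes a is a dense numpy array (not a scipy sparse matrix), and the function name random_nonzero_columns and its seed parameter are hypothetical:

```python
import numpy as np

def random_nonzero_columns(a, seed=None):
    """Return one random non-zero column index per row (hypothetical helper).

    Assumes a dense 2-D array. A row with no non-zero entry returns column 0.
    """
    rng = np.random.default_rng(seed)
    scores = rng.random(a.shape)   # independent uniform scores in [0, 1)
    scores[a == 0] = -1.0          # zeros can never win the argmax
    return scores.argmax(axis=1)   # highest score among the non-zeros

a = np.array([[1, 1, 0, 0, 0, 0], [2, 0, 1, 0, 2, 0], [3, 0, 4, 0, 0, 3]])
cols = random_nonzero_columns(a, seed=0)
picked = a[np.arange(len(a)), cols]
```

Because the scores are independent and identically distributed, each non-zero element in a row is equally likely to come out on top. The trade-off is a temporary float array the same shape as a, which may matter if the matrix is very large.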