How to randomly keep a specific number of values in a 2D NumPy array and replace the rest with a no-data value, without changing anything else

I am new to scientific computing. I have a 2D NumPy array (say, A) with shape (11153L, 4218L) and dtype uint8. I want to keep the data at some number (say, 10000) of random (row, col) positions and fill the rest with a no-data value. How can I do this?

Here the no-data value is taken from another variable in my environment, e.g. my_raster_nodata_values = dsc.noDataValue

Solution

You could use np.random.choice with the optional argument replace set to False to draw a.size - 10000 unique flat indices out of the a.size available, and set those positions to no_data_value. The 10000 positions that are not drawn keep their data. Thus, an implementation would be -

a.ravel()[np.random.choice(a.size, a.size - 10000, replace=False)] = no_data_value

Alternatively, we can use np.put, which makes the intent more explicit, like so -

np.put(a, np.random.choice(a.size, a.size - 10000, replace=False), no_data_value)
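One caveat with the a.ravel() form: ravel returns a view only when the array is contiguous, so if a is itself a slice or transpose of a bigger array, the assignment can land in a temporary copy and leave a untouched. np.put does not have this problem. A small check, where base and window are made-up names for illustration -

```python
import numpy as np

base = np.arange(1, 13, dtype=np.uint8).reshape(3, 4)
window = base[:, :2]          # non-contiguous view into base
no_data_value = 0

# ravel() has to copy here, so this assignment only changes a temporary.
window.ravel()[[0, 1]] = no_data_value
untouched = window.copy()     # snapshot: still holds the original values

# np.put writes through to the view (and hence to base).
np.put(window, [0, 1], no_data_value)

print(untouched)
print(window)
```

For a freshly created or loaded C-contiguous array, both forms behave the same.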

A sample run should make it easier to understand -

In [94]: a     # Input array
Out[94]: 
array([[163,  80, 142, 169, 214],
       [  7,  59, 102, 104, 234],
       [ 44, 143,   7,  30, 232],
       [ 71,  15,  64,  42, 141]])

In [95]: no_data_value = 0  # No value specifier

In [98]: N = 10 # Number of elems to keep

In [99]: a.ravel()[np.random.choice(a.size,a.size-N,replace=0)] = no_data_value

In [100]: a
Out[100]: 
array([[  0,   0, 142,   0,   0],
       [  7,   0,   0, 104, 234],
       [  0,   0,   7,  30, 232],
       [ 71,   0,  64,   0, 141]])
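For an array as large as the one in the question (about 47 million elements), drawing a.size - N indices is a lot of work when N is only 10000. A minimal sketch of the reverse approach, assuming it is acceptable to build a new array instead of modifying a in place, draws only the N positions to keep. The function name keep_random and its seed parameter are hypothetical, not part of NumPy -

```python
import numpy as np

def keep_random(a, n_keep, no_data_value, seed=None):
    """Return a copy of `a` with `n_keep` random cells kept, rest set to no-data.

    `keep_random` and `seed` are illustrative names, not a NumPy API.
    """
    rng = np.random.default_rng(seed)
    # Draw the flat positions of the cells to keep, without repeats.
    keep_idx = rng.choice(a.size, size=n_keep, replace=False)
    # Start from an array filled with the no-data value ...
    out = np.full(a.shape, no_data_value, dtype=a.dtype)
    # ... and copy the kept cells over by flat index.
    out.flat[keep_idx] = a.flat[keep_idx]
    return out

a = np.array([[163,  80, 142, 169, 214],
              [  7,  59, 102, 104, 234],
              [ 44, 143,   7,  30, 232],
              [ 71,  15,  64,  42, 141]], dtype=np.uint8)
result = keep_random(a, n_keep=10, no_data_value=0, seed=0)
print(result)
```

Passing a seed makes the selection reproducible; leave it as None for a different selection on each call.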

If the input array already contains one or more elements equal to no_data_value, you might want to count them towards the elements being set, so that exactly N valid (non-no-data) values remain. For such a case, draw the indices only from the valid positions and reduce the number to set by the existing count, like so -

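# Number of valid cells that still need to be set to no_data_value
S = a.size - N - (a == no_data_value).sum()
# Pick S distinct flat indices among the cells that are not already no-data
idx = np.random.choice(np.flatnonzero(a != no_data_value), S, replace=False)
a.ravel()[idx] = no_data_value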

Sample run -

In [65]: a
Out[65]: 
array([[240,  30,  61,  38, 145],
       [ 91,  65, 108, 154, 118],
       [155, 198,  65,  65, 189],
       [248, 140, 154, 186, 186]])

In [66]: no_data_value = 65  # No value specifier

In [67]: N = 10 # Number of elems to keep

In [68]: S = a.size - N - (a == no_data_value).sum()

In [69]: idx = np.random.choice(np.flatnonzero(a!=no_data_value),S,replace=0)

In [70]: a.ravel()[idx] = no_data_value

In [71]: a
Out[71]: 
array([[240,  30,  61,  38,  65],
       [ 65,  65, 108,  65,  65],
       [ 65, 198,  65,  65,  65],
       [248, 140, 154, 186,  65]])
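Note that S becomes negative when the array already has fewer than N valid values, in which case np.random.choice raises an error. A sketch that wraps the steps above with a guard for that case could look as follows. The function name thin_valid_cells and the seed parameter are hypothetical, and np.put is used so the write happens in place even for non-contiguous input -

```python
import numpy as np

def thin_valid_cells(a, n_keep, no_data_value, seed=None):
    """Set random valid cells of `a` to no-data, in place, so `n_keep` remain.

    `thin_valid_cells` and `seed` are illustrative names, not a NumPy API.
    """
    rng = np.random.default_rng(seed)
    # Flat indices of the cells that currently hold real data.
    valid_idx = np.flatnonzero(a != no_data_value)
    # Same quantity as S above: a.size - N - (a == no_data_value).sum()
    n_drop = valid_idx.size - n_keep
    if n_drop <= 0:
        return a  # already n_keep or fewer valid cells; nothing to do
    drop_idx = rng.choice(valid_idx, size=n_drop, replace=False)
    np.put(a, drop_idx, no_data_value)
    return a

a = np.array([[240,  30,  61,  38, 145],
              [ 91,  65, 108, 154, 118],
              [155, 198,  65,  65, 189],
              [248, 140, 154, 186, 186]], dtype=np.uint8)
original = a.copy()
thin_valid_cells(a, n_keep=10, no_data_value=65, seed=0)
print(a)
```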