I am currently new to NumPy, but very proficient with SQL.
I used a function called coalesce
in SQL, which I was disappointed not to find in NumPy. I need this function to create a third array want
from 2 arrays i.e. array1
and array2
, where zero/ missing observations in array1
are replaced by observations in array2
under the same address/Location. I can't figure out how to use np.where
?
Once this task is accomplished, I would like to take the lower diagonal of this array want
and then populate a final array want2
noting the first non-zero observation. If all observations i.e. coalesce(array1, array2)
returns missing or 0 in want2
, then assign by default zero.
I have written an example demonstrating the desired behavior.
import numpy as np
array1= np.array(([-10,0,20],[-1,0,0],[0,34,-50]))
array2= np.array(([10,10,50],[10,0,25],[50,45,0]))
# Coalesce array1,array2 i.e. get the first non-zero value from array1, then from array2.
# if array1 is empty or zero, then populate table want with values from array2 under same address
want=np.tril(np.array(([-10,10,20],[-1,0,25],[50,34,-50])))
print(array1)
print(array2)
print(want)
# print first instance of nonzero observation from each column of table want
want2=np.array([-10,34,-50])
print(want2)
"Coalesce": use putmask
to replace values equal to zero with values from array2
:
want = array1.copy()
np.putmask(array1.copy(), array1==0, array2)
First nonzero element of each column of np.tril(want)
:
where_nonzero = np.where(np.tril(want) != 0)
"""For the where array, get the indices of only
the first index for each column"""
first_indices = np.unique(where_nonzero[1], return_index=True)[1]
# Get the values from want for those indices
want2 = want[(where_nonzero[0][first_indices], where_nonzero[1][first_indices])]