I am performing non-negative least squares using scipy. A trivial example would be as follows:
import numpy as np
from scipy.optimize import nnls
A = np.array([[60, 70, 120, 60],[60, 90, 120, 70]], dtype='float32')
b = np.array([6, 5])
x, res = nnls(A, b)
Now, I have a situation where some entries in A or b can be missing (np.nan). Something like:
A_2 = A.copy()
A_2[0, 2] = np.nan
Of course, running NNLS on A_2, b will not work, as scipy does not accept inf or nan values.
How can we perform NNLS while masking out the missing entries from the computation? Effectively, this should translate to
Minimize |(A_2.x- b)[mask]|
where mask can be defined as:
mask = ~np.isnan(A_2)
In general, entries can be missing from both A and b.
Possibly helpful:
[1] How to include constraint to Scipy NNLS function solution so that it sums to 1
I think you can compute the mask first (determine which points you want included) and then perform NNLS. Given the mask
In []: mask
Out[]:
array([[ True, True, False, True],
[ True, True, True, True]], dtype=bool)
you can decide whether to include a column by checking if all of its values are True, using np.all along the first axis.
In []: np.all(mask, axis=0)
Out[]: array([ True, True, False, True], dtype=bool)
This can then be used as a column mask for A.
In []: nnls(A_2[:,np.all(mask, axis=0)], b)
Out[]: (array([ 0.09166667, 0. , 0. ]), 0.7071067811865482)
The same idea can be used for b to construct a row mask.
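Putting the column mask and the row mask together might look like the sketch below. The helper name masked_nnls is hypothetical (it is not part of scipy), and it assumes that a missing entry in b removes that row, while a missing entry in the remaining rows of A removes that column. Dropping a column is the same as fixing that coefficient at zero, so the helper returns a full-length x with zeros in the dropped positions.

```python
import numpy as np
from scipy.optimize import nnls

A = np.array([[60, 70, 120, 60], [60, 90, 120, 70]], dtype=float)
b = np.array([6, 5], dtype=float)

A_2 = A.copy()
A_2[0, 2] = np.nan  # one missing entry in A


def masked_nnls(A, b):
    """Hypothetical helper: NNLS on the rows and columns without NaN."""
    # Row mask: keep only the equations whose right-hand side is known.
    row_mask = ~np.isnan(b)
    A_rows = A[row_mask]
    # Column mask: keep only columns that are complete in the kept rows.
    col_mask = ~np.isnan(A_rows).any(axis=0)
    x_sub, rnorm = nnls(A_rows[:, col_mask], b[row_mask])
    # Scatter the solution back; dropped columns keep a coefficient of 0.
    x = np.zeros(A.shape[1])
    x[col_mask] = x_sub
    return x, rnorm, col_mask


x, rnorm, col_mask = masked_nnls(A_2, b)
print(x, rnorm, col_mask)
```

Whether dropping the whole column is acceptable depends on the problem: if a column is missing only a few entries, it may be preferable to drop the affected rows instead, which keeps the variable but loses those equations.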