My intention was to solve the L1 error fitting problem using scipy.optimize, as suggested in "L1 norm instead of L2 norm for cost function in regression model", but I keep getting the wrong solution. To debug, I fell back to least squares, for which we know the closed-form solution:
import numpy as np
from scipy.optimize import minimize

n = 10
d = 2
A = np.random.rand(n, d)
x = np.random.rand(d, 1)
b = np.dot(A, x)  # targets generated from x, so an exact fit exists

# closed-form least-squares solution as a sanity check
x_hat = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.linalg.norm(np.dot(A, x_hat) - b))
def fit(X, params):
    return X.dot(params)

def cost_function(params, X, y):
    return np.linalg.norm(y - fit(X, params))
x0 = np.zeros((d,1))
output = minimize(cost_function, x0, args=(A, b))
y_hat = fit(A, output.x)
print(np.linalg.norm(y_hat-b))
print(output)
The output is:
4.726604209672303e-16
2.2714597315189407
fun: 2.2714597315189407
hess_inv: array([[ 0.19434496, -0.1424377 ],
[-0.1424377 , 0.16718703]])
jac: array([ 3.57627869e-07, -2.98023224e-08])
message: 'Optimization terminated successfully.'
nfev: 32
nit: 4
njev: 8
status: 0
success: True
x: array([0.372247 , 0.32633966])
which looks very strange, because it can't even solve the L2 regression. Am I making a stupid mistake here?
The error is due to mismatched shapes. minimize flattens x0, so the iterates (and therefore params) have shape (d,). As a result, fit(X, params) has shape (n,), while your y has shape (n, 1). The expression y - fit(X, params) then has shape (n, n) due to NumPy's implicit broadcasting, and as a consequence np.linalg.norm returns the Frobenius matrix norm instead of the Euclidean vector norm.
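You can see the broadcasting effect in isolation; a minimal sketch using the shapes from above (n = 10):

import numpy as np

y = np.zeros((10, 1))     # column vector, shape (10, 1)
pred = np.zeros(10)       # flat vector, shape (10,)
print((y - pred).shape)   # (10, 10): broadcast difference, not elementwise subtraction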
Solution: change your cost function to:
def cost_function(params, X, y):
    return np.linalg.norm(y - fit(X, params.reshape(d, 1)))
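With the shapes consistent, the same pattern should carry over to the L1 objective you originally wanted. A minimal sketch, assuming the same A, b, and d as above; here y is flattened with ravel() instead of reshaping params, which is an equivalent fix that avoids relying on the global d:

from scipy.optimize import minimize
import numpy as np

def l1_cost(params, X, y):
    # residual stays 1-D, so this is the sum of absolute errors
    return np.linalg.norm(y.ravel() - X.dot(params), ord=1)

x0 = np.zeros(d)  # 1-D start point, matching minimize's iterates
res = minimize(l1_cost, x0, args=(A, b), method='Nelder-Mead')
print(res.x, res.fun)

Since the L1 norm is not differentiable where a residual is zero, a derivative-free method such as Nelder-Mead tends to be more robust here than the default gradient-based solver.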