I am exploring some of the numpy/scipy functions and I noticed that scipy.optimize.fmin_bfgs requires a change in the objective function to give correct results, compared to a straight function call. My first definition of the fnRSS function returns the correct value when called directly but refuses to work in the optimization; my second definition gives the wrong value when called directly but the right one when running the optimization. Can someone tell me what is so crucial about transposing the vY parameter for the optimization? It should already be 164x1.
import numpy as np
import scipy as sp
import pandas as pd
from scipy import optimize

if __name__ == "__main__":
    urlSheatherData = "http://www.stat.tamu.edu/~sheather/book/docs/datasets/MichelinNY.csv"
    data = pd.read_csv(urlSheatherData)
    Xs = np.vstack(data[['Service', 'Decor', 'Food', 'Price']].values)
    Xs = np.concatenate((np.vstack(np.ones(Xs.shape[0])), Xs), axis=1)
    Ys = np.vstack(data[['InMichelin']].values)
    # optimal solution (given)
    vBeta = np.array([-1.49209249, -0.01117662, 0.044193, 0.05773374, 0.00179794]).reshape(5, 1)
    print Ys.shape, Xs.shape, vBeta.shape
    # first definition of function
    def fnRSS(vBeta, vY, mX):
        return np.sum((vY - np.dot(mX, vBeta))**2)
    print fnRSS(vBeta, Ys, Xs)  # correct value
    print np.linalg.lstsq(Xs, Ys)[1]  # confirm correct value
    print sp.optimize.fmin_bfgs(fnRSS, x0=vBeta, args=(Ys, Xs))  # wrong value
    # second definition
    def fnRSS(vBeta, vY, mX):
        return np.sum((vY.T - np.dot(mX, vBeta))**2)
    print fnRSS(vBeta, Ys, Xs)  # incorrect value
    print sp.optimize.fmin_bfgs(fnRSS, x0=vBeta, args=(Ys, Xs))  # correct convergence but simple call gives different value
My output:
(164, 1) (164, 5) (5, 1)
26.3239061505
[ 26.32390615]
Warning: Desired error not necessarily achieved due to precision loss.
Current function value: 6660.000000
Iterations: 39
Function evaluations: 3558
Gradient evaluations: 480
[ 4.51220111e-01 1.32711255e-07 8.09143368e-08 -1.06633003e-07
-5.18448332e-08]
9002.87916028
Warning: Desired error not necessarily achieved due to precision loss.
Current function value: 26.323906
Iterations: 29
Function evaluations: 1954
Gradient evaluations: 260
[-1.49209095 -0.0111764 0.04419313 0.05773347 0.00179789]
It's not about vY.T, but vBeta: the x that fmin_bfgs passes to fnRSS is not a 2d column vector but a 1d array. So even though you explicitly specify x0=vBeta with shape (5, 1), it is converted internally to a 1d array of shape (5,), and in the end is returned as such.
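A minimal sketch of the broadcasting that results (shapes only; the random stand-in data, and the names rng and beta_1d, are mine, not from the question, since the real CSV isn't needed to see the effect):

```python
import numpy as np

# Stand-in data with the same shapes as in the question.
rng = np.random.default_rng(0)
mX = rng.normal(size=(164, 5))   # design matrix
vY = rng.normal(size=(164, 1))   # column vector of responses
vBeta = np.zeros((5, 1))         # what you pass as x0
beta_1d = vBeta.ravel()          # what fmin_bfgs actually hands to fnRSS

# First definition with a 1d beta: mX @ beta_1d has shape (164,), and
# (164, 1) - (164,) broadcasts to a (164, 164) matrix, so np.sum squares
# 164*164 "residuals" -- a silently wrong RSS.
print((vY - mX @ beta_1d).shape)    # (164, 164)

# Second definition with a 1d beta: vY.T is (1, 164), and
# (1, 164) - (164,) broadcasts elementwise to (1, 164) -- correct RSS.
print((vY.T - mX @ beta_1d).shape)  # (1, 164)

# A definition robust to both 1d and 2d inputs flattens everything first:
def fnRSS(vBeta, vY, mX):
    return np.sum((np.ravel(vY) - mX @ np.ravel(vBeta)) ** 2)
```

With the flattened version, the direct call with the (5, 1) vBeta and the calls fmin_bfgs makes with the (5,) array compute the same value, so both definitions in the question become unnecessary.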