I am having trouble using scipy.optimize.minimize() to train a logistic neuron. My cost and gradient functions have been tested successfully on their own.
minimize() raises "IndexError: too many indices for array". I am using method='CG', but the same thing happens with other methods.
res = minimize(loCostEntro, W, args=(XX,Y,lmbda), method='CG', jac=loGradEntro, options={'maxiter': 500})
W (weights), XX (training set) and Y (results) are all 2-D NumPy arrays.
Here is the code of the cost and gradient functions:
def loOutput(X, W):
    Z = np.dot(X, W)
    O = misc.sigmoid(Z)
    return O

def loCostEntro(W, X, Y, lmbda=0):
    m = len(X)
    O = loOutput(X, W)
    cost = -1 * (1 / m) * (np.log(O).T.dot(Y) + np.log(1 - O).T.dot(1 - Y)) \
        + (lmbda / (2 * m)) * np.sum(np.square(W[1:]))
    return cost[0,0]

def loGradEntro(W, X, Y, lmbda=0):
    m = len(X)
    O = loOutput(X, W)
    GRAD = (1 / m) * np.dot(X.T, (O - Y)) + (lmbda / m) * np.r_[[[0]], W[1:].reshape(-1, 1)]
    return GRAD
Thanks to this working example, I figured out what was wrong. minimize() passes a 1-D weights array (W) to my cost and gradient functions, whereas my functions only supported 2-D arrays.
Reshaping W inside the dot product, as below, fixed the issue:
def loOutput(X, W):
    Z = np.dot(X, W.reshape(-1, 1))  # reshape(-1, 1) because scipy.minimize() sends 1-D W !!!
    O = misc.sigmoid(Z)
    return O
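To see why a 1-D W ends in an IndexError, it can help to compare the shapes directly. The following is only a small sketch with made-up arrays (X, Y, W_col and W_flat are placeholder names, not my real data): with a 1-D W the dot product loses its second axis, so indexing with [0, 0] as in the cost function fails.

```python
import numpy as np

# Placeholder data, only the shapes matter here
X = np.ones((4, 3))        # 4 samples, 3 features
Y = np.ones((4, 1))        # targets as a column vector
W_col = np.zeros((3, 1))   # 2-D weights, as in my own tests
W_flat = W_col.ravel()     # 1-D weights, as the optimizer passes them

print(np.dot(X, W_col).shape)    # (4, 1)
print(np.dot(X, W_flat).shape)   # (4,)

prod_2d = np.dot(X, W_col).T.dot(Y)    # shape (1, 1), so [0, 0] works
prod_1d = np.dot(X, W_flat).T.dot(Y)   # shape (1,), so [0, 0] fails

error_message = None
try:
    prod_1d[0, 0]
except IndexError as err:
    error_message = str(err)
print(error_message)
```

The exact wording of the error message depends on the NumPy version, but it is the same "too many indices" error as in the question.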
By the way, I ran into a similar problem after fixing this one: the gradient function must return a 1-D array. So I added:
def loGradEntroFlatten(W, X, Y, lmbda=0):
    return loGradEntro(W, X, Y, lmbda).flatten()
and updated the call accordingly:
res = minimize(loCostEntro, W, args=(XX,Y,lmbda), method='CG', jac=loGradEntroFlatten, options={'maxiter': 500})
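For anyone who wants to try this end to end, here is a self-contained sketch that keeps everything 1-D from the start, so no reshape or flatten is needed. It is not my original code: the names cost and grad, the synthetic data and the use of scipy.special.expit in place of my misc.sigmoid are all assumptions made for illustration. It also checks the gradient with scipy.optimize.check_grad before optimizing.

```python
import numpy as np
from scipy.optimize import minimize, check_grad
from scipy.special import expit  # assumed stand-in for misc.sigmoid

def cost(w, X, y, lmbda=0.0):
    """Regularized cross-entropy cost; w and y are 1-D arrays."""
    m = len(X)
    o = np.clip(expit(X @ w), 1e-12, 1 - 1e-12)  # avoid log(0)
    data_term = -(y @ np.log(o) + (1 - y) @ np.log(1 - o)) / m
    reg_term = lmbda / (2 * m) * np.sum(w[1:] ** 2)  # bias w[0] is not regularized
    return data_term + reg_term

def grad(w, X, y, lmbda=0.0):
    """Gradient of cost, returned as a 1-D array of the same shape as w."""
    m = len(X)
    g = X.T @ (expit(X @ w) - y) / m
    g[1:] += lmbda / m * w[1:]
    return g

# Synthetic data (hypothetical): a bias column plus two random features
rng = np.random.default_rng(0)
m = 200
X = np.c_[np.ones(m), rng.normal(size=(m, 2))]
true_w = np.array([0.5, 2.0, -1.0])
y = (rng.random(m) < expit(X @ true_w)).astype(float)

lmbda = 1.0
w0 = np.zeros(X.shape[1])  # 1-D starting point

# Difference between the analytic and finite-difference gradients (should be tiny)
grad_error = check_grad(cost, grad, w0, X, y, lmbda)
print("gradient check error:", grad_error)

res = minimize(cost, w0, args=(X, y, lmbda), method='CG', jac=grad,
               options={'maxiter': 500})
print("weights:", res.x)
print("final cost:", res.fun)
```

Starting from a 1-D w0 is also the safer choice with recent SciPy versions, which expect x0 to be one-dimensional.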