Tags: python, scipy, logistic-regression, minimize

Training a logistic neuron with scipy.minimize()


I am having trouble using scipy.minimize() to train a logistic neuron. My cost and gradient functions have been tested successfully on their own.

scipy.minimize() raises "IndexError: too many indices for array". I am using method='CG', but the same error occurs with other methods.

res = minimize(loCostEntro, W, args=(XX, Y, lmbda), method='CG', jac=loGradEntro, options={'maxiter': 500})

W (weights), XX (training examples) and Y (labels) are all 2-D NumPy arrays.

Here is the code of the cost and gradient functions:

import numpy as np
import misc  # local helper module (not shown) that provides sigmoid()


def loOutput(X, W):
    """Forward pass: linear combination of the inputs, then a sigmoid."""
    Z = np.dot(X, W)
    O = misc.sigmoid(Z)
    return O


def loCostEntro(W, X, Y, lmbda=0):
    """Regularized cross-entropy cost; the bias weight W[0] is not regularized."""
    m = len(X)
    O = loOutput(X, W)
    cost = -1 * (1 / m) * (np.log(O).T.dot(Y) + np.log(1 - O).T.dot(1 - Y)) \
        + (lmbda / (2 * m)) * np.sum(np.square(W[1:]))
    return cost[0, 0]


def loGradEntro(W, X, Y, lmbda=0):
    """Gradient of the regularized cross-entropy cost."""
    m = len(X)
    O = loOutput(X, W)
    GRAD = (1 / m) * np.dot(X.T, (O - Y)) + (lmbda / m) * np.r_[[[0]], W[1:].reshape(-1, 1)]
    return GRAD
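
(misc is a local module of mine; its sigmoid is presumably just the standard element-wise logistic function, e.g.:

import numpy as np

def sigmoid(Z):
    # Element-wise logistic function: 1 / (1 + exp(-Z))
    return 1 / (1 + np.exp(-Z))

)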

Solution

  • Thanks to this working example, I figured out what was wrong: scipy.minimize() passes a 1-D weights array (W) to the cost and gradient functions, whereas my functions only supported 2-D arrays.

    Reshaping W inside the dot product, as below, fixed the issue:

    def loOutput(X, W):
        Z = np.dot(X, W.reshape(-1, 1))  # reshape(-1, 1) because scipy.minimize() passes a 1-D W
        O = misc.sigmoid(Z)
        return O
    
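    A quick shape check illustrates where the IndexError comes from (the array sizes here are arbitrary):

    import numpy as np

    X = np.ones((5, 3))
    W2 = np.zeros((3, 1))
    W1 = W2.ravel()             # the 1-D array that minimize() actually passes

    print(np.dot(X, W2).shape)  # (5, 1): cost stays 2-D and cost[0, 0] works
    print(np.dot(X, W1).shape)  # (5,):  cost ends up 1-D, so cost[0, 0] raises
                                # "IndexError: too many indices for array"
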

    By the way, I ran into a similar problem after fixing this one: the gradient function passed as jac must also return a 1-D array, since the optimizer combines it with its own 1-D vectors internally. So I added:

    def loGradEntroFlatten(W, X, Y, lmbda=0):
        return loGradEntro(W, X, Y, lmbda).flatten()
    

    and updated the minimize() call accordingly:

    res = minimize(loCostEntro, W, args=(XX, Y, lmbda), method='CG', jac=loGradEntroFlatten, options={'maxiter': 500})
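
    With both fixes in place, the whole pipeline can be checked end to end. The sketch below is self-contained: misc.sigmoid is replaced by a local definition, and the synthetic data, seed, and lmbda value are made up purely for illustration:

    import numpy as np
    from scipy.optimize import minimize

    def sigmoid(Z):
        return 1 / (1 + np.exp(-Z))

    def loOutput(X, W):
        return sigmoid(np.dot(X, W.reshape(-1, 1)))  # W arrives 1-D from minimize()

    def loCostEntro(W, X, Y, lmbda=0):
        m = len(X)
        O = loOutput(X, W)
        cost = -1 * (1 / m) * (np.log(O).T.dot(Y) + np.log(1 - O).T.dot(1 - Y)) \
            + (lmbda / (2 * m)) * np.sum(np.square(W[1:]))
        return cost[0, 0]

    def loGradEntro(W, X, Y, lmbda=0):
        m = len(X)
        O = loOutput(X, W)
        return (1 / m) * np.dot(X.T, (O - Y)) + (lmbda / m) * np.r_[[[0]], W[1:].reshape(-1, 1)]

    def loGradEntroFlatten(W, X, Y, lmbda=0):
        return loGradEntro(W, X, Y, lmbda).flatten()

    rng = np.random.default_rng(0)
    XX = np.c_[np.ones(100), rng.normal(size=(100, 2))]   # bias column + 2 features
    true_W = np.array([[0.5], [2.0], [-1.0]])
    Y = (sigmoid(XX.dot(true_W)) > rng.uniform(size=(100, 1))).astype(float)

    W0 = np.zeros(XX.shape[1])                            # 1-D initial guess
    res = minimize(loCostEntro, W0, args=(XX, Y, 0.1), method='CG',
                   jac=loGradEntroFlatten, options={'maxiter': 500})
    print(res.success, res.x)

    Note that res.x comes back 1-D as well; reshape it to a column vector if the rest of your code expects 2-D weights.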