TypeError: predict() takes 2 positional arguments but 3 were given


I searched Stack Overflow for this error and found several posts, but none addressing this specific situation.

I have the following dataframe:

df

The input and output variables are defined in this code:

xcol=["h","o","p","d","ddlt","devdlt","sl","lt"]
ycol=["Q","r"]
x=df[xcol].values
y=df[ycol].values

My goal is to predict the output values Q and r from the inputs (x). I have tried two approaches, and both fail. In the first, I tried a multi-output regressor.

I first split the data into training and test sets:

import numpy as np
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
y_train = y_train.ravel()
y_test = y_test.ravel()

Then I import the regressor class:

from sklearn.multioutput import MultiOutputRegressor

and then try to predict Q and r:

reg= MultiOutputRegressor(estimator=100, n_jobs=None)
reg=reg.predict(X_train, y_train)

And this gives me the error:

TypeError: predict() takes 2 positional arguments but 3 were given

What am I doing wrong and how can I fix it?

Next thing I tried was a neural network. After assigning the columns x and y, I made the neural network:

import numpy
import scipy.special

# neural network class definition
class neuralNetwork:

    # Step 1: set up the network
    def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):
        # set number of nodes in each input, hidden, output layer
        self.inodes = inputnodes
        self.hnodes = hiddennodes
        self.onodes = outputnodes

        # link weight matrices, wih and who (weights into the hidden and
        # output layers); these matrices are multiplied with the signals
        # to produce an output
        # weights inside the arrays (matrices) are w_i_j, where the link is
        # from node i to node j in the next layer
        # w11 w21
        # w12 w22 etc
        self.wih = numpy.random.normal(0.0, pow(self.inodes, -0.5),
                                       (self.hnodes, self.inodes))
        self.who = numpy.random.normal(0.0, pow(self.hnodes, -0.5),
                                       (self.onodes, self.hnodes))

        # setting the learning rate
        self.lr = learningrate

        # activation function is the sigmoid function
        self.activation_function = lambda x: scipy.special.expit(x)

    # Step 2: train the network
    def train(self, inputs_list, targets_list):
        # convert input lists to 2d arrays (matrices)
        inputs = numpy.array(inputs_list, ndmin=2).T
        targets = numpy.array(targets_list, ndmin=2).T

        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)

        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)

        # output layer error is the (target - actual)
        output_errors = targets - final_outputs
        # hidden layer error is the output_errors, split by weights,
        # recombined at hidden nodes
        hidden_errors = numpy.dot(self.who.T, output_errors)

        # update the weights for the links between the hidden and output layers
        self.who += self.lr * numpy.dot(
            output_errors * final_outputs * (1.0 - final_outputs),
            numpy.transpose(hidden_outputs))

        # update the weights for the links between the input and hidden layers
        self.wih += self.lr * numpy.dot(
            hidden_errors * hidden_outputs * (1.0 - hidden_outputs),
            numpy.transpose(inputs))

    # Step 3: query the network
    def query(self, inputs_list):
        # convert input lists to 2d array (matrix)
        inputs = numpy.array(inputs_list, ndmin=2).T

        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)

        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)

        return final_outputs

I then created an instance of a neural network:

   #Creating instance of neural network 

   #number of input, hidden and output nodes
   input_nodes = 8
   hidden_nodes = 100
   output_nodes = 2

   #learning rate is 0.8
   learning_rate = 0.8

   #create instance of neural network
   n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)

I have 8 inputs and 2 outputs that need to be predicted.

I then trained the neural network:

# train the neural network
# go through all records in the training data set 
for record in df:
    #scale and shift the inputs
    inputs = x
    #create the target output values 
    targets = y
    n.train(inputs, targets)
    pass

Then I want to query the predicted outputs, and this is where it goes wrong.

I want to add two extra columns to the dataframe with the predictions for Q (Q*) and r (r*):

df["Q*","r*"] = n.query(x)

I really don't know how to do this properly. The code above gives me the error:

ValueError: Length of values does not match length of index

Any help appreciated.

Steven


Solution

  • Regarding the first part (MultiOutputRegressor) of your question, there are several issues with your code...

    To start with, the estimator argument of MultiOutputRegressor should not be a number, but, as the docs say:

    estimator : estimator object

    An estimator object implementing fit and predict.

    So, for example, to use a random forest with default parameters, you should use

    reg = MultiOutputRegressor(RandomForestRegressor()) 
    

    (see this answer for some more examples)

    Second, in your code you never fit your regressor; you should add

    reg.fit(X_train, y_train)
    

    after the definition.

    Third, predict doesn't take the ground truth values (y_train here) as arguments, only the features (X_train); from the docs again:

    predict(X)

    Predict multi-output variable using a model trained for each target variable.

    Parameters: X : (sparse) array-like, shape (n_samples, n_features)

    Data.

    Returns: y : (sparse) array-like, shape (n_samples, n_outputs)

    Multi-output targets predicted across multiple predictors. Note: Separate models are generated for each predictor.

    Since you also pass y_train, predict receives one argument too many: counting self, it accepts 2 positional arguments (self and X) but is given 3, hence the TypeError. Change the call to reg.predict(X_train) and the error goes away.
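Putting the three fixes together, a minimal sketch could look like the following. The dataframe here is filled with random numbers purely as a stand-in for the real data, and the 200-row size, the RandomForestRegressor settings, and the random_state values are assumptions chosen for illustration. Note that the y_train.ravel() and y_test.ravel() calls from the question are left out: they would flatten the two-column target into a 1-D array that no longer has one row per sample.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputRegressor

# Synthetic stand-in for the question's dataframe (values are made up).
xcol = ["h", "o", "p", "d", "ddlt", "devdlt", "sl", "lt"]
ycol = ["Q", "r"]
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((200, len(xcol) + len(ycol))), columns=xcol + ycol)

x = df[xcol].values  # shape (200, 8)
y = df[ycol].values  # shape (200, 2); keep it 2-D, do not call .ravel()

X_train, X_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0
)

# 1) Pass an estimator object, not a number.
reg = MultiOutputRegressor(RandomForestRegressor(n_estimators=50, random_state=0))

# 2) Fit before predicting.
reg.fit(X_train, y_train)

# 3) predict takes only the features; keep the result in its own variable
#    so the fitted regressor is not overwritten.
y_pred = reg.predict(X_test)  # shape (n_test_samples, 2)

# Predictions for every row, written to two new columns one at a time.
all_pred = reg.predict(x)
df["Q*"] = all_pred[:, 0]
df["r*"] = all_pred[:, 1]
```

Predicting on X_test rather than X_train gives a fairer picture of how the model behaves on unseen rows. Assigning each prediction column separately also sidesteps the df["Q*","r*"] = ... form from the question, which addresses a single column keyed by a tuple rather than two columns.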