I searched Stack Overflow for this error and found several posts, but none addressing this specific situation.
I have the following dataframe:
The input variables and output variables are defined in this code:
My goal is to predict the output values Q & r based on the inputs (x). I have tried two approaches and both fail. In the first one, I tried a multi-output regressor.
I first split the data into training and test sets:
import numpy as np
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
y_train = y_train.ravel()
y_test = y_test.ravel()
Then I import the class:
from sklearn.multioutput import MultiOutputRegressor
And then I try to predict Q & r:
reg= MultiOutputRegressor(estimator=100, n_jobs=None)
reg=reg.predict(X_train, y_train)
And this gives me the error:
TypeError: predict() takes 2 positional arguments but 3 were given
What am I doing wrong and how can I fix it?
Next thing I tried was a neural network. After assigning the columns x and y, I made the neural network:
import numpy
import scipy.special

# neural network class definition
class neuralNetwork:
    # Step 1:
    def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):
        # set number of nodes in each input, hidden, output layer
        self.inodes = inputnodes
        self.hnodes = hiddennodes
        self.onodes = outputnodes
        # link weight matrices, wih and who (weights in hidden and output layers),
        # we are going to create matrices for the multiplication of it to get an
        # output
        # weights inside the arrays (matrices) are w_i_j, where link is from node
        # i to node j in the next layer
        # w11 w21
        # w12 w22 etc
        self.wih = numpy.random.normal(0.0,pow(self.inodes,-0.5),( self.hnodes,
        self.who = numpy.random.normal(0.0,pow(self.hnodes,-0.5),( self.onodes,
        # setting the learning rate
        self.lr = learningrate
        # activation function is the sigmoid function
        self.activation_function = lambda x: scipy.special.expit(x)

    # Step 2:
    def train(self, inputs_list, targets_list):
        # convert input lists to 2d array (matrix)
        inputs = numpy.array(inputs_list, ndmin=2).T
        targets = numpy.array(targets_list, ndmin=2).T
        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)
        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)
        # output layer error is the (target - actual)
        output_errors = targets - final_outputs
        # hidden layer error is the output_errors, split by weights, recombined at hidden nodes
        hidden_errors = numpy.dot(self.who.T, output_errors)
        # update the weights for the links between the hidden and output layers
        self.who += self.lr * numpy.dot((output_errors*final_outputs * (1.0-
        # update the weights for the links between the input and hidden layers
        self.wih += self.lr*numpy.dot((hidden_errors*hidden_outputs*(1.0-

    # Step 3:
    def query(self, inputs_list):
        # convert input lists to 2d array (matrix)
        inputs = numpy.array(inputs_list, ndmin=2).T
        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)
        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)
        return final_outputs
I then created an instance of a neural network:
#Creating instance of neural network
#number of input, hidden and output nodes
input_nodes = 8
hidden_nodes = 100
output_nodes = 2
#learning rate is 0.8
learning_rate = 0.8
#create instance of neural network
n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)
I have 8 inputs and 2 outputs that need to be predicted.
I then trained the neural network:
# train the neural network
# go through all records in the training data set
for record in df:
    # scale and shift the inputs
    inputs = x
    # create the target output values
    targets = y
    n.train(inputs, targets)
Then I want to query the predicted outputs, and this is where it goes wrong.
I want to add 2 extra columns to the dataframe with the predictions for Q (Q*) & r (r*):
df["Q*","r*"] = n.query(x)
I really don't know how to do this properly. The code above gives me the error:
ValueError: Length of values does not match length of index
Any help appreciated.
Regarding the first part (MultiOutputRegressor) of your question, there are several issues with your code...
To start with, the estimator argument of MultiOutputRegressor should not be a number, but, as the docs say:
estimator : estimator object
An estimator object implementing fit and predict.
So, for example, to use a random forest with default parameters, you should use
reg = MultiOutputRegressor(RandomForestRegressor())
(see this answer for some more examples)
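As a rough illustration of what "an estimator object implementing fit and predict" means, the sketch below checks a few candidates for those two methods. The list of candidates is only an example I picked; any scikit-learn regressor should behave the same way, while a plain number such as 100 has neither method.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

# Two real regressors and the integer that was passed in the question.
candidates = [RandomForestRegressor(), LinearRegression(), 100]

# An object is usable as `estimator` only if it exposes both fit and predict.
valid = [hasattr(c, "fit") and hasattr(c, "predict") for c in candidates]
print(valid)
```
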
Second, in your code you never fit your regressor; you should add
reg.fit(X_train, y_train)
after the definition.
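A minimal sketch of the define-then-fit step, using synthetic stand-in data because your dataframe isn't shown (X_demo, y_demo, the sample count and the forest settings are all my assumptions). Note that the targets stay two-dimensional, one column per output:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.multioutput import MultiOutputRegressor

# Synthetic stand-in data (an assumption): 50 samples, 8 features, 2 targets.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 8))
y_demo = np.column_stack([X_demo[:, 0] + X_demo[:, 1],
                          X_demo[:, 2] - X_demo[:, 3]])

reg = MultiOutputRegressor(RandomForestRegressor(n_estimators=20, random_state=0))
reg.fit(X_demo, y_demo)  # y_demo has shape (50, 2)

# After fitting, there is one underlying forest per target column.
n_fitted = len(reg.estimators_)
print(n_fitted)
```
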
Third, predict doesn't take the ground truth values (y_train here) as arguments, only the features (X_train); from the docs again:
Predict multi-output variable using a model trained for each target variable.
Parameters: X : (sparse) array-like, shape (n_samples, n_features)
Returns: y : (sparse) array-like, shape (n_samples, n_outputs)
Multi-output targets predicted across multiple predictors. Note: Separate models are generated for each predictor.
Since you also pass y_train in your code, predict receives one argument too many, which is exactly what the TypeError reports (the counts in the message include self); simply change the call to reg.predict(X_train), and you will be fine.
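Putting the three points together, here is a hedged end-to-end sketch. Since your dataframe isn't shown, df_demo and its feature column names x1..x8 are hypothetical; only Q, r, Q* and r* come from your question. Two details I'm assuming on top of the fixes above: y is kept as a two-column array rather than flattened with ravel(), and the predictions are stored under a new name instead of overwriting reg.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputRegressor

# Hypothetical stand-in for the dataframe: 8 feature columns and 2 targets.
rng = np.random.default_rng(42)
feature_cols = [f"x{i}" for i in range(1, 9)]
df_demo = pd.DataFrame(rng.normal(size=(200, 8)), columns=feature_cols)
df_demo["Q"] = 2.0 * df_demo["x1"] + df_demo["x2"]
df_demo["r"] = df_demo["x3"] - df_demo["x4"]

x = df_demo[feature_cols].to_numpy()
y = df_demo[["Q", "r"]].to_numpy()  # shape (200, 2); not flattened

X_train, X_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0
)

reg = MultiOutputRegressor(RandomForestRegressor(n_estimators=50, random_state=0))
reg.fit(X_train, y_train)

# predict takes only the features and returns one column per target.
y_pred = reg.predict(X_test)

# Predictions for every row, written back as two separate columns.
all_pred = reg.predict(x)
df_demo["Q*"] = all_pred[:, 0]
df_demo["r*"] = all_pred[:, 1]
```

Assigning each prediction column separately keeps the length of each new column equal to the number of rows in the dataframe.
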