I would like to rebuild, with tflearn, an MLP that I first implemented with scikit-learn's MLPRegressor.
sklearn.neural_network.MLPRegressor implementation:
import numpy as np
import pandas as pd
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

train_data = pd.read_csv('train_data.csv', delimiter=';', decimal=',', header=0)
test_data = pd.read_csv('test_data.csv', delimiter=';', decimal=',', header=0)

X_train = np.array(train_data.drop(['output'], axis=1))
X_scaler = StandardScaler()
X_scaler.fit(X_train)
X_train = X_scaler.transform(X_train)
Y_train = np.array(train_data['output'])

clf = MLPRegressor(activation='tanh', solver='lbfgs', alpha=0.0001, hidden_layer_sizes=(3,))
clf.fit(X_train, Y_train)
prediction = clf.predict(X_train)
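For reference, StandardScaler standardizes each feature to zero mean and unit variance. A minimal pure-Python sketch of the same per-column transform (illustrative only, with made-up sample values):

```python
# Minimal sketch of what StandardScaler does per feature:
# subtract the column mean, divide by the column standard deviation.
def standardize(column):
    n = len(column)
    mean = sum(column) / n
    # StandardScaler uses the population (biased) standard deviation.
    std = (sum((x - mean) ** 2 for x in column) / n) ** 0.5
    return [(x - mean) / std for x in column]

scaled = standardize([20.0, 40.0, 60.0, 88.0])
# The standardized column has mean 0 and standard deviation 1.
```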
The model worked and I got a score (the R² returned by clf.score) of 0.85. Now I would like to build a similar MLP with tflearn. I started with the following code:
import numpy as np
import pandas as pd
import tflearn as tfl
from sklearn.preprocessing import StandardScaler

train_data = pd.read_csv('train_data.csv', delimiter=';', decimal=',', header=0)
test_data = pd.read_csv('test_data.csv', delimiter=';', decimal=',', header=0)

X_train = np.array(train_data.drop(['output'], axis=1))
X_scaler = StandardScaler()
X_scaler.fit(X_train)
X_train = X_scaler.transform(X_train)

Y_train = np.array(train_data['output'])
Y_scaler = StandardScaler()
# StandardScaler expects a 2D array, so reshape the target column.
Y_scaler.fit(Y_train.reshape(-1, 1))
Y_train = Y_scaler.transform(Y_train.reshape(-1, 1))

net = tfl.input_data(shape=[None, 6])
net = tfl.fully_connected(net, 3, activation='tanh')
net = tfl.fully_connected(net, 1, activation='sigmoid')
net = tfl.regression(net, optimizer='sgd', loss='mean_square', learning_rate=3.)
clf = tfl.DNN(net)
clf.fit(X_train, Y_train, n_epoch=200, show_metric=True)
prediction = clf.predict(X_train)
At some point I definitely configured something the wrong way, because the prediction is way off. The range of Y_train is between 20 and 88, while the predictions are numbers around 0.005. In the tflearn documentation I only found examples for classification. I realized that the regression layer uses 'categorical_crossentropy' as its default loss function, which is meant for classification problems, so I selected 'mean_square' instead. I also tried to normalize Y_train, but the prediction still does not even match the range of Y_train. Any thoughts?
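One thing worth keeping in mind: since Y_train was standardized before training, the network's outputs live in scaled units, and have to be mapped back (e.g. with Y_scaler.inverse_transform) before comparing them with the original 20-88 range. A pure-Python sketch of that inverse mapping, with made-up mean/std values:

```python
# Illustrative only: undo a StandardScaler-style transform by hand.
# A prediction made against standardized targets is in scaled units;
# scaled * std + mean restores the original units. The mean and std
# used here are made-up example values, not fitted from real data.
def inverse_standardize(scaled_values, mean, std):
    return [s * std + mean for s in scaled_values]

restored = inverse_standardize([0.0, 1.0, -1.0], mean=54.0, std=17.0)
# restored == [54.0, 71.0, 37.0]
```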
Take a look at the accepted answer.
I made a couple of really dumb mistakes.

First of all, I scaled the output to the interval 0 to 1, but used tanh as the activation function in the output layer, which delivers values from -1 to 1. So I had to use either an activation function that outputs values between 0 and 1 (e.g. sigmoid), or linear with no scaling applied.
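The mismatch between output activation and target scaling can be checked directly from the activation ranges. A quick pure-Python illustration (sigmoid defined by hand, so only the math module is needed):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# tanh saturates towards (-1, 1); sigmoid towards (0, 1).
# Targets scaled into [0, 1] therefore match sigmoid's range,
# while half of tanh's range (the negative part) is wasted.
tanh_lo, tanh_hi = math.tanh(-10), math.tanh(10)      # close to -1 and 1
sig_lo, sig_hi = sigmoid(-10), sigmoid(10)            # close to 0 and 1
```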
Secondly, and most importantly, for my data I chose a pretty bad combination of learning rate and n_epoch. I didn't specify a learning rate, and the default one (0.001, I think) was far too small; I ended up using 3.0. At the same time the epoch count (10) was also far too small; with 200 it worked fine.
I also explicitly chose sgd as the optimizer (default: adam), which turned out to work a lot better.
I updated the code in my question.