
tflearn multi layer perceptron with unexpected prediction


I would like to rebuild, with tflearn, an MLP I first implemented with scikit-learn's MLPRegressor.

sklearn.neural_network.MLPRegressor implementation:

import numpy as np
import pandas as pd
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

train_data = pd.read_csv('train_data.csv', delimiter = ';', decimal = ',', header = 0)
test_data = pd.read_csv('test_data.csv', delimiter = ';', decimal = ',', header = 0)

# standardize the six input features
X_train = np.array(train_data.drop(['output'], axis = 1))
X_scaler = StandardScaler()
X_scaler.fit(X_train)
X_train = X_scaler.transform(X_train)

Y_train = np.array(train_data['output'])

clf = MLPRegressor(activation = 'tanh', solver='lbfgs', alpha=0.0001, hidden_layer_sizes=(3,))
clf.fit(X_train, Y_train)
prediction = clf.predict(X_train)
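For reference, the 0.85 figure mentioned below comes from the model's `score` method; for `MLPRegressor` that is the R² coefficient, not classification accuracy. A minimal sketch on synthetic stand-in data (the original CSVs are not available, so the shapes and target here are assumptions):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# synthetic stand-in for the real train_data.csv: 200 rows, 6 features
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = X @ rng.normal(size=6) + 0.1 * rng.normal(size=200)  # noisy linear target

X_scaled = StandardScaler().fit_transform(X)
clf = MLPRegressor(activation='tanh', solver='lbfgs', alpha=0.0001,
                   hidden_layer_sizes=(3,), max_iter=2000, random_state=0)
clf.fit(X_scaled, y)
print(clf.score(X_scaled, y))  # R^2 on the training data
```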

The model worked and `clf.score` reported 0.85 (an R² score, since this is a regressor). Now I would like to build a similar MLP with tflearn. I started with the following code:

import numpy as np
import pandas as pd
import tflearn as tfl
from sklearn.preprocessing import StandardScaler

train_data = pd.read_csv('train_data.csv', delimiter = ';', decimal = ',', header = 0)
test_data = pd.read_csv('test_data.csv', delimiter = ';', decimal = ',', header = 0)

# standardize the six input features
X_train = np.array(train_data.drop(['output'], axis = 1))
X_scaler = StandardScaler()
X_scaler.fit(X_train)
X_train = X_scaler.transform(X_train)

# standardize the target as well (StandardScaler expects a 2-D array)
Y_train = np.array(train_data['output'])
Y_scaler = StandardScaler()
Y_scaler.fit(Y_train.reshape((-1, 1)))
Y_train = Y_scaler.transform(Y_train.reshape((-1, 1)))

net = tfl.input_data(shape=[None, 6])
net = tfl.fully_connected(net, 3, activation='tanh')
net = tfl.fully_connected(net, 1, activation='sigmoid')
net = tfl.regression(net, optimizer='sgd', loss='mean_square', learning_rate=3.)

clf = tfl.DNN(net)
clf.fit(X_train, Y_train, n_epoch=200, show_metric=True)
prediction = clf.predict(X_train)

At some point I must have configured something the wrong way, because the prediction is way off: Y_train ranges from 20 to 88, while the predicted values are around 0.005. In the tflearn documentation I only found examples for classification.

UPDATE 1:

I realized that the regression layer uses 'categorical_crossentropy' as its default loss function, which is meant for classification problems, so I selected 'mean_square' instead. I also tried to normalize Y_train. The prediction still does not even match the range of Y_train. Any thoughts?
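One thing worth keeping in mind when normalizing Y_train: the network then predicts in the scaled space, so its output has to be mapped back with Y_scaler.inverse_transform before it can be compared against the original 20-to-88 range. A minimal sketch with stand-in targets (the real Y_train is not available here):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

Y_train = np.linspace(20, 88, 50)               # stand-in for the real targets
Y_scaler = StandardScaler()
Y_scaled = Y_scaler.fit_transform(Y_train.reshape(-1, 1))

# suppose `prediction` comes back from the network in the scaled space;
# here we use the scaled targets themselves, i.e. a perfect prediction
prediction = Y_scaled
restored = Y_scaler.inverse_transform(prediction)
print(restored.min(), restored.max())           # back in the 20-88 range
```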

FINAL UPDATE:

Take a look at the accepted answer.


Solution

  • I made a couple of actually really dumb mistakes.

    First of all, I scaled the output to the interval 0 to 1 but used tanh as the activation function in the output layer, which delivers values from -1 to 1. So I had to use either an activation function that outputs values between 0 and 1 (e.g. sigmoid) or a linear activation with no scaling applied.

    Secondly, and most importantly, for my data I chose a pretty bad combination of learning rate and n_epoch. I didn't specify a learning rate, and the default one (0.1, I think) was too small in any case; I ended up using 3.0. The epoch count (10) was also far too small; with 200 it worked fine.

    I also explicitly chose sgd as optimizer (default: adam), which turned out to work a lot better.

    I updated the code in my question.
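The range mismatch from the first mistake can be checked numerically with plain NumPy, independent of tflearn: tanh spans (-1, 1) while the logistic sigmoid spans (0, 1), so sigmoid is the one that matches targets scaled to [0, 1].

```python
import numpy as np

z = np.linspace(-10, 10, 1001)

tanh_out = np.tanh(z)                 # output range roughly (-1, 1)
sigmoid_out = 1.0 / (1.0 + np.exp(-z))  # output range roughly (0, 1)

print(tanh_out.min(), tanh_out.max())
print(sigmoid_out.min(), sigmoid_out.max())
```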