
Kaggle Titanic with tflearn neural network


I have solved the Titanic problem with logistic regression, and now I want to solve it with a neural network. But my model always returns 1 (meaning "survived") for every test input. Maybe there is a problem in my model. How can I fix this?

import pandas as pd
import tflearn

train_data = pd.read_csv('data/train.csv')
test_data = pd.read_csv('data/test.csv')

#Some data cleaning process
#......


# .values replaces the deprecated .as_matrix(); reshape(-1, 1) makes the labels a column vector
X_train = train_data.drop("Survived",axis=1).values
Y_train = train_data["Survived"].values.reshape(-1, 1)
X_test  = test_data.drop("PassengerId",axis=1).values


net = tflearn.input_data(shape=[None, 6])
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 1, activation='softmax')
net = tflearn.regression(net)
model = tflearn.DNN(net)
model.fit(X_train, Y_train, n_epoch=10, batch_size=16, show_metric=True)

pred = model.predict(X_test)
print(pred)

Solution

  • Using softmax as the output activation forces the outputs of all nodes in that layer to sum to 1. Since your output layer has only a single node, that node will always output 1 by definition, whatever the input.

    Softmax is the wrong activation for a single-node binary classifier; it only makes sense when the output layer has two or more nodes. A better option here is the logistic function, which tflearn calls sigmoid.

    So instead of

    net = tflearn.fully_connected(net, 1, activation='softmax')
    

    try

    net = tflearn.fully_connected(net, 1, activation='sigmoid')
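
To see why the single-node softmax is stuck at 1, here is a minimal sketch in plain Python. It does not use tflearn; the helper functions `softmax` and `sigmoid` and the sample logits are illustrative assumptions, not part of the original code.

```python
import math

def softmax(logits):
    """Softmax over the logits of one layer (one value per output node)."""
    m = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Logistic function: maps any real logit into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical pre-activation values of the single output node
logits = [-5.0, -0.5, 0.0, 2.0]

# One node + softmax: exp(z) / exp(z), so the output cannot depend on z
softmax_single = [softmax([z])[0] for z in logits]

# One node + sigmoid: the output varies with z
sigmoid_single = [sigmoid(z) for z in logits]

# Two nodes + softmax with logits [0, z] gives the same probability as sigmoid(z)
softmax_two = [softmax([0.0, z])[1] for z in logits]

print(softmax_single)
print(sigmoid_single)
print(softmax_two)
```

The last list shows that softmax is still usable for binary classification if the output layer has two nodes and the labels are one-hot encoded. Also, changing the activation may not be enough on its own: as far as I know, tflearn.regression defaults to loss='categorical_crossentropy', so with a single sigmoid node it is probably worth passing loss='binary_crossentropy' as well.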