python tensorflow machine-learning deep-learning kaggle

Kaggle Titanic with tflearn neural network

I have solved the Titanic problem with logistic regression, and now I want to solve it with a neural network. But my model always returns 1 (meaning survived) for every test input. Maybe there is a problem in my model. How can I fix this?

import pandas as pd
import tflearn

train_data = pd.read_csv('data/train.csv')
test_data = pd.read_csv('data/test.csv')

#Some data cleaning process
#......


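# .values replaces the deprecated .as_matrix() (removed in pandas 1.0)
X_train = train_data.drop("Survived", axis=1).values
# reshape(-1, 1) infers the row count instead of hard-coding 891
Y_train = train_data["Survived"].values.reshape((-1, 1))
X_test  = test_data.drop("PassengerId", axis=1).values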


net = tflearn.input_data(shape=[None, 6])
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 1, activation='softmax')
net = tflearn.regression(net)
model = tflearn.DNN(net)
model.fit(X_train, Y_train, n_epoch=10, batch_size=16, show_metric=True)

pred = model.predict(X_test)
print(pred)

Solution

Using softmax as an activation layer in the output ensures that the sum of the outputs across all nodes in that layer is 1. Since you only have a single node, and the output has to sum to 1, it will always output 1 by definition.
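To see why, here is a minimal sketch in plain Python, independent of tflearn. The `softmax` helper below is written by hand for illustration and is not tflearn's implementation; the input values are made up.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)  # subtract the max to avoid overflow in exp
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# With a single output node, the result is 1.0 regardless of the input,
# because the only value is divided by itself.
single_outputs = [softmax([z])[0] for z in (-5.0, 0.0, 3.7)]
print(single_outputs)

# With two nodes the individual outputs can differ, but they still sum to 1.
two_outputs = softmax([-1.0, 2.0])
print(two_outputs, sum(two_outputs))
```

So no matter what the hidden layers learn, a one-node softmax layer cannot output anything but 1.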

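You should never use softmax as the activation on a single output node for a binary classification task (softmax only makes sense with two or more nodes). A better option is the logistic function, which TensorFlow and tflearn call sigmoid. Note that tflearn.regression defaults to a categorical cross-entropy loss, so with a single sigmoid output you will probably also want to pass loss='binary_crossentropy'.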

So instead of

net = tflearn.fully_connected(net, 1, activation='softmax')

try

net = tflearn.fully_connected(net, 1, activation='sigmoid')
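As a rough illustration of what the sigmoid output looks like, here is a small plain-Python sketch. The `sigmoid` helper, the example inputs, and the 0.5 cut-off are assumptions made for this example; they are not part of tflearn or of the original model.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical pre-activation values of the single output node
logits = [-2.0, 0.0, 3.0]

# Unlike a one-node softmax, the outputs now depend on the input
probs = [sigmoid(z) for z in logits]
print(probs)

# Turn probabilities into 0/1 labels; the 0.5 threshold is an assumption
labels = [1 if p >= 0.5 else 0 for p in probs]
print(labels)
```

With a sigmoid output, model.predict should return values between 0 and 1 that you can threshold in the same way to get survived / not survived labels.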