python tensorflow one-hot-encoding

Exporting TensorFlow predictions to CSV, but the result contains all zeros. Is this because of one-hot encoding?


I am using the TensorFlow framework for classification. My dataset contains around 1160 output classes. Each class value is a 6-digit number, for example 789954. After training and testing on the dataset with TensorFlow, I got an accuracy of around 99%.

The second step is to write the predictions to a CSV file so that I can check whether the predicted outcomes (logits) match the original labels in the set. We know that the logits are one-hot encoded vectors. So I took the following steps to decode the one-hot encoding.

prediction=tf.argmax(logits,1)
print(prediction.eval(feed_dict={features : test_features, keep_prob: 1.0}))
prediction = np.asarray(prediction.eval(feed_dict={features : test_features, keep_prob: 1.0}))

prediction = np.reshape(prediction, (test_features.shape[0],1))
np.savetxt("prediction.csv", prediction, delimiter=",") 

The resulting values in the CSV file are 0.00E+00 for every entry, but I expected 6-digit codes in the respective CSV entries. I guess I have gone wrong somewhere in my one-hot encoding.

Any help is appreciated.

Added: I one-hot encoded the labels this way.

labels = tf.one_hot(labels, n_classes)

Here n_classes = 1160, and every label value is a 6-digit number.


Solution

  • If each sample has only one label, then your approach is fine. Use sklearn's LabelEncoder to convert your 6-digit categories to integer labels first. Each label should be an integer from 0 to 1159 (that is, 0 to n_classes - 1), and only then should you do the one-hot encoding.
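
A minimal sketch of the point above, using NumPy only so it runs without TensorFlow. The label codes are invented for illustration, and the one_hot helper is a hypothetical stand-in for tf.one_hot (which likewise produces an all-zero row for an index outside [0, depth)). One plausible reason for the all-zero output is that every 6-digit code is out of range for depth 1160, so every target row is zero and argmax returns 0. np.unique with return_inverse=True is used here in place of LabelEncoder; both assign ids by sorted class value.

```python
import numpy as np

def one_hot(indices, depth):
    # Hypothetical NumPy stand-in for tf.one_hot:
    # an index outside [0, depth) yields an all-zero row.
    return (np.asarray(indices)[:, None] == np.arange(depth)).astype(np.float32)

# Invented 6-digit class codes, for illustration only.
raw_labels = np.array([789954, 123456, 789954, 555555, 123456])

# Encoding the raw codes directly: every code is >= depth,
# so every row is all zeros and argmax over each row is 0.
broken = one_hot(raw_labels, depth=1160)
broken_decoded = broken.argmax(axis=1)

# Map the codes to contiguous ids 0..n_classes-1 first.
# `classes` plays the role of LabelEncoder.classes_ and
# `ids` the role of LabelEncoder.fit_transform(raw_labels).
classes, ids = np.unique(raw_labels, return_inverse=True)
n_classes = len(classes)
targets = one_hot(ids, depth=n_classes)

# Stand-in for the evaluated tf.argmax(logits, 1); here the
# targets themselves are decoded, as if the model were perfect.
predicted_ids = targets.argmax(axis=1)

# Map ids back to the original codes
# (the equivalent of LabelEncoder.inverse_transform).
predicted_codes = classes[predicted_ids]

# fmt="%d" writes plain integers; the default format is "%.18e",
# which is why values show up in scientific notation.
np.savetxt("prediction.csv", predicted_codes.reshape(-1, 1),
           fmt="%d", delimiter=",")
```

In the real pipeline, predicted_ids would come from evaluating tf.argmax(logits, 1) on the test features, and the same encoder fitted on the training labels must be used to map the ids back to codes.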