python, machine-learning, keras, nlp, one-hot-encoding

One-hot encoding labels for binary text classification that are already 0s and 1s?


I am doing simple binary text classification, and my labels are already 0s and 1s. Do I still need to one-hot encode them so that they are in a [0,1] and [1,0] format?

When I feed the labels into my Keras Sequential() model as <class 'numpy.ndarray'>, the model trains and I get decent accuracy. Should I still one-hot encode them beforehand?


Solution

  • One-hot encoding does not help in the binary case, because a single 0/1 column already distinguishes the two classes. If you encode it into two columns, the extra column is just the complement of the first (1 - label), so it adds no information.

    One-hot encoding a binary label is therefore redundant in your context.
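
The redundancy can be checked directly. This is a minimal sketch using NumPy only; the `labels` array is made-up example data standing in for your 0/1 labels, and `np.eye(2)[labels]` is just one common way to one-hot encode, not necessarily what your pipeline would use.

```python
import numpy as np

# Hypothetical labels, standing in for the asker's 0/1 label array.
labels = np.array([0, 1, 1, 0, 1])

# One-hot encode by picking rows of a 2x2 identity matrix:
# class 0 -> [1, 0], class 1 -> [0, 1].
one_hot = np.eye(2, dtype=int)[labels]

# The second column is exactly the original label vector,
# and the first column is its complement (1 - label).
second_matches = np.array_equal(one_hot[:, 1], labels)
first_is_complement = np.array_equal(one_hot[:, 0], 1 - labels)

# The original labels can be recovered from the encoding with argmax,
# so nothing was gained by adding the extra column.
recovered = one_hot.argmax(axis=1)

print(one_hot)
print(second_matches, first_is_complement)
```

Assuming the model ends in a single sigmoid unit trained with binary cross-entropy (the question does not show the model, so this is a guess), the 0/1 array can be passed as is. The two-column form would only be required if the final layer had two softmax units trained with categorical cross-entropy.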