Tags: python, numpy, one-hot-encoding

One Hot Encoding using numpy


If the input is zero I want to make an array which looks like this:

[1,0,0,0,0,0,0,0,0,0]

and if the input is 5:

[0,0,0,0,0,1,0,0,0,0]

For the above I wrote:

np.put(np.zeros(10),5,1)

but it did not work.

Is there a way to implement this in one line?


Solution

  • Usually, when you want to get a one-hot encoding for classification in machine learning, you have an array of indices.

    import numpy as np
    nb_classes = 6
    targets = np.array([[2, 3, 4, 0]]).reshape(-1)
    one_hot_targets = np.eye(nb_classes)[targets]
    

    The one_hot_targets is now

    array([[ 0.,  0.,  1.,  0.,  0.,  0.],
           [ 0.,  0.,  0.,  1.,  0.,  0.],
           [ 0.,  0.,  0.,  0.,  1.,  0.],
           [ 1.,  0.,  0.,  0.,  0.,  0.]])
    

    The .reshape(-1) ensures the labels are a flat 1-D array, whatever format they arrive in (you might also have [[2], [3], [4], [0]]). The -1 is a special value that tells NumPy to infer the size of that dimension from the total number of elements. Since it is the only dimension here, the array is flattened.

    Copy-Paste solution

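    def get_one_hot(targets, nb_classes):
        targets = np.asarray(targets)  # accept plain lists as well as arrays
        # Flatten the labels and pick one identity-matrix row per label
        res = np.eye(nb_classes)[targets.reshape(-1)]
        # Restore the original label shape, with the class axis appended
        return res.reshape(list(targets.shape) + [nb_classes])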
    

    Package

    You can use mpu.ml.indices2one_hot. It's tested and simple to use:

    import mpu.ml
    one_hot = mpu.ml.indices2one_hot([1, 3, 0], nb_classes=5)
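
For the single-index case in the question, here is a minimal sketch of why the original attempt appeared not to work. np.put modifies its first argument in place and returns None, so calling it on a temporary np.zeros(10) discards the array that was just filled. Indexing one row of the identity matrix gives the same vector in one line. The variable names below are illustrative only.

```python
import numpy as np

# np.put writes into the array in place and returns None,
# so the array has to be kept in a variable first.
a = np.zeros(10)
result = np.put(a, 5, 1)   # result is None; a now holds the 1 at index 5

# One-line alternative: row 5 of the 10x10 identity matrix.
one_hot = np.eye(10)[5]
```

Both a and one_hot end up as the length-10 vector with a single 1 at index 5. The np.eye version builds a full 10x10 matrix first, which is fine for a small number of classes.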
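
To illustrate the .reshape(-1) step and the shape-preserving helper, here is a small sketch using column-vector labels. It restates get_one_hot in a form that converts its input with np.asarray, so plain lists work as well as arrays. The sample labels are made up for the example.

```python
import numpy as np

def get_one_hot(targets, nb_classes):
    # Convert first so that lists are accepted too.
    targets = np.asarray(targets)
    # Flatten the labels, then pick one identity-matrix row per label.
    res = np.eye(nb_classes)[targets.reshape(-1)]
    # Restore the original label shape, with the class axis appended.
    return res.reshape(list(targets.shape) + [nb_classes])

labels = np.array([[2], [3], [4], [0]])   # shape (4, 1)
flat = labels.reshape(-1)                 # shape (4,), values 2, 3, 4, 0
one_hot = get_one_hot(labels, 6)          # shape (4, 1, 6)
```

Because the helper reshapes its result, the output keeps the layout of the input labels and only gains a trailing axis of length nb_classes. Taking argmax over that last axis recovers the original labels.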