python numpy ordinal

Numpy - create ordinal categories embedding


I've written the following code to one-hot encode a list of ints:

import numpy as np
a = np.array([1,2,3,4])

targets = np.zeros((a.size, a.max()))
targets[np.arange(a.size),a-1] = 1
targets

Output:

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

My classes are ordinal, so I would like to change the code so that each row has ones in every position up to and including its class:

array([[1., 0., 0., 0.],
       [1., 1., 0., 0.],
       [1., 1., 1., 0.],
       [1., 1., 1., 1.]])

How can I achieve this?


Solution

  • Use a broadcasted comparison of a (as a column) against the column indices 0 .. a.max()-1 -

    (a[:,None]>np.arange(a.max())).astype(float)
    

    Sample run -

    In [47]: a = np.array([3,1,2,4]) # generic case: values in arbitrary order
    
    In [48]: (a[:,None]>np.arange(a.max())).astype(float)
    Out[48]: 
    array([[1., 1., 1., 0.],
           [1., 0., 0., 0.],
           [1., 1., 0., 0.],
           [1., 1., 1., 1.]])
    

    If a has many entries but only a small range of values, we can build every possible row once as a lower-triangular matrix and then index it with a-1 -

    np.tri(a.max(), dtype=float)[a-1]
    

    Sample run -

    In [79]: a = np.array([3,1,2,4])
    
    In [80]: np.tri(a.max(), dtype=float)[a-1]
    Out[80]: 
    array([[1., 1., 1., 0.],
           [1., 0., 0., 0.],
           [1., 1., 0., 0.],
           [1., 1., 1., 1.]])
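
As a rough sketch, the two approaches above can be wrapped in small helpers and checked against each other. The function names are hypothetical (they are not part of NumPy), and the code assumes the labels are positive integers starting at 1, as in the question.

```python
import numpy as np

def ordinal_encode_broadcast(a):
    """Hypothetical helper: row i gets ones in its first a[i] columns."""
    a = np.asarray(a)
    # a[:, None] has shape (n, 1) and np.arange(a.max()) has shape (k,),
    # so the comparison broadcasts to (n, k); entry (i, j) is True when j < a[i].
    return (a[:, None] > np.arange(a.max())).astype(float)

def ordinal_encode_tri(a):
    """Hypothetical helper: look up precomputed rows of a lower-triangular matrix."""
    a = np.asarray(a)
    # Row r of np.tri(k) holds r + 1 leading ones, so label v maps to row v - 1.
    # Assumes every label is >= 1; a label of 0 would index row -1 (the last row).
    return np.tri(a.max(), dtype=float)[a - 1]

a = np.array([3, 1, 2, 4])
enc_b = ordinal_encode_broadcast(a)
enc_t = ordinal_encode_tri(a)

print(enc_b)
print(np.array_equal(enc_b, enc_t))  # both methods agree on this input
print(enc_b.sum(axis=1))             # each row sum recovers the original label
```

The two helpers only agree under the stated assumption. If a label of 0 can occur, the comparison version yields an all-zero row, while the np.tri lookup wraps around to the last row, so the comparison version is the safer choice in that case.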