I've written the following code to one-hot encode
a list of ints:
import numpy as np
a = np.array([1,2,3,4])
targets = np.zeros((a.size, a.max()))
targets[np.arange(a.size),a-1] = 1
targets
Output:
array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])
I would like to change the code to better fit my ordinal class problem, so that the output would be:
array([[1., 0., 0., 0.],
       [1., 1., 0., 0.],
       [1., 1., 1., 0.],
       [1., 1., 1., 1.]])
How can I achieve this?
Use a broadcasted comparison -
(a[:,None]>np.arange(a.max())).astype(float)
Sample run -
In [47]: a = np.array([3,1,2,4]) # generic case of different numbers spread across
In [48]: (a[:,None]>np.arange(a.max())).astype(float)
Out[48]:
array([[1., 1., 1., 0.],
       [1., 0., 0., 0.],
       [1., 1., 0., 0.],
       [1., 1., 1., 1.]])
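To make the broadcasting step easier to follow, here is a minimal sketch that wraps the same comparison in a function. The name ordinal_encode and the num_classes parameter are my own additions for illustration, not part of NumPy, and the sketch assumes the labels are integers starting at 1.

```python
import numpy as np

def ordinal_encode(labels, num_classes=None):
    """Hypothetical helper: row i gets labels[i] leading ones."""
    labels = np.asarray(labels)
    if num_classes is None:
        # Assumption: the largest label present defines the number of columns.
        num_classes = labels.max()
    # labels[:, None] has shape (n, 1); np.arange(num_classes) has shape (num_classes,).
    # Broadcasting compares every label against every column index 0..num_classes-1,
    # producing an (n, num_classes) boolean mask that is True where column < label.
    mask = labels[:, None] > np.arange(num_classes)
    return mask.astype(float)

encoded = ordinal_encode([3, 1, 2, 4])
print(encoded)
```

Passing num_classes explicitly may be useful when a batch does not happen to contain the largest class, since otherwise the output width would depend on the data.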
If there are many entries with a small range of numbers in a, we can create all the possible rows once and then index into them with an offset version of a (that is, a-1) -
np.tri(a.max(), dtype=float)[a-1]
Sample run -
In [79]: a = np.array([3,1,2,4])
In [80]: np.tri(a.max(), dtype=float)[a-1]
Out[80]:
array([[1., 1., 1., 0.],
       [1., 0., 0., 0.],
       [1., 1., 0., 0.],
       [1., 1., 1., 1.]])
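As a rough check of why the lookup works, the sketch below builds the triangular table once and compares the result against the broadcasted comparison. The variable names table, by_lookup and by_comparison are just illustrative. It assumes every label lies between 1 and a.max(); a label of 0 would become index -1 and silently pick the last row.

```python
import numpy as np

a = np.array([3, 1, 2, 4])

# np.tri(n) is the n x n lower-triangular matrix of ones (diagonal included),
# so row k holds k + 1 leading ones followed by zeros.
table = np.tri(a.max(), dtype=float)

# Labels start at 1 while row indices start at 0, hence the a - 1 offset.
by_lookup = table[a - 1]

# Same result computed with the broadcasted comparison, for cross-checking.
by_comparison = (a[:, None] > np.arange(a.max())).astype(float)

same = np.array_equal(by_lookup, by_comparison)
print(same)
```

The table has only a.max() rows no matter how long a is, so the per-entry work is a plain row lookup rather than a fresh comparison.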