Tags: python, numpy, functional-programming

Convert from flat one-hot encoding to list of indices with variable length


I have a list of integers, each paired with a function, which I convert to a variable-length one-hot encoding. For example, consider a list:

l = [(4, func1), (3, func2), (6, func3)]

The list's meaning is: the function object func1 accepts integers in the range 0..3, func2 can be called as func2(0), func2(1), or func2(2), and so on.

This list is turned into a flat list of 4 + 3 + 6 = 13 boolean values, one per valid (function, argument) pair. Yes, this is not quite the classical one-hot encoding. That means that the value:

one_hot = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

should result in the function call

func1(1)

Another example, if one_hot contains

one_hot = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

the resulting function call should be

func3(5)

I'm now looking for an efficient, elegant way to turn the list of boolean values into a function call. Is there a solution that uses NumPy functions instead of an explicit loop?
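
To make the layout concrete, here is a minimal sketch of the encoding direction as described above. The helper name `encode` and the placeholder function bodies are assumptions for illustration only; the question does not say how the one-hot list is actually produced.

```python
import numpy as np

# Hypothetical stand-ins for the question's function objects.
def func1(x): return ("func1", x)
def func2(x): return ("func2", x)
def func3(x): return ("func3", x)

l = [(4, func1), (3, func2), (6, func3)]

def encode(l, func_index, arg):
    """Build the flat one-hot vector that selects l[func_index][1](arg)."""
    sizes = [n for n, _ in l]
    if not 0 <= arg < sizes[func_index]:
        raise ValueError("argument out of range for this function")
    offset = sum(sizes[:func_index])           # start of this function's block
    one_hot = np.zeros(sum(sizes), dtype=int)  # 4 + 3 + 6 = 13 entries
    one_hot[offset + arg] = 1
    return one_hot

print(encode(l, 0, 1).tolist())  # the vector that should trigger func1(1)
print(encode(l, 2, 5).tolist())  # the vector that should trigger func3(5)
```

Each function owns a contiguous block of the flat vector, so decoding amounts to finding which block a set position falls in and how far into that block it is.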


Solution

  • What about:

    import numpy as np

    func1 = lambda x: 1*x
    func2 = lambda x: 2*x
    func3 = lambda x: 3*x
    
    l = [(4, func1), (3, func2), (6, func3)]
    
    one_hot = np.array([0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
    
    # Block sizes, and for each flat position the index of the owning function
    r = [t[0] for t in l]
    a = np.repeat(np.arange(len(l)), r)
    # array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 2])
    
    # Offset of each flat position within its block, i.e. the function argument
    param = np.arange(len(one_hot)) - np.repeat(np.r_[0, np.cumsum(r)[:-1]], r)
    # array([0, 1, 2, 3, 0, 1, 2, 0, 1, 2, 3, 4, 5])
    
    idx = np.where(one_hot==1)[0]
    # array([1])
    
    # l is a plain Python list, so look up each function by its integer index
    [l[i][1](x) for i, x in zip(a[idx], param[idx])]
    # [1]
    

    Output with one_hot = np.array([0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1]) as input:

    [1, 2, 6, 15]  # [func1(1), func2(1), func3(2), func3(5)]
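
A possible variant, sketched here under the same setup: instead of materialising the `a` and `param` arrays for every position, the block start offsets can be searched directly with `np.searchsorted`. The wrapper name `decode` is an assumption for illustration, not part of the original answer.

```python
import numpy as np

func1 = lambda x: 1*x
func2 = lambda x: 2*x
func3 = lambda x: 3*x

l = [(4, func1), (3, func2), (6, func3)]

def decode(l, one_hot):
    """Call the function/argument pair selected by each set position."""
    sizes = np.array([n for n, _ in l])
    starts = np.cumsum(sizes) - sizes      # first flat index of each block
    hot = np.flatnonzero(one_hot)          # flat positions that are set
    # Index of the last block whose start is <= the flat position
    which = np.searchsorted(starts, hot, side="right") - 1
    args = hot - starts[which]             # offset inside that block
    return [l[w][1](int(x)) for w, x in zip(which, args)]

print(decode(l, [0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1]))
```

The final step is still a Python-level loop, since calling arbitrary function objects cannot be vectorised by NumPy; only the index arithmetic is.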