python, one-hot-encoding

How to easily convert a set of tags to a membership tuple ("multi-hot" encoding) in Python?


I have a set of tags, tags, and an ordered list of all possible tags, taglist. I want to convert the set into a "multi-hot" encoding, i.e. get a list or tuple with the same length as taglist that has a 1 at each position whose tag is in tags and a 0 everywhere else.

Currently I do it the straightforward way:

        multihot = []
        for i in range(len(taglist)):
            tag = taglist[i]
            if tag in tags:
                multihot.append(1)
            else:
                multihot.append(0)

Is it possible to write this as a one-liner?


Solution

  • You can do this with a list comprehension that checks whether each tag from taglist is in tags and emits 1 if it is, 0 otherwise.

    multihot = [1 if tag in tags else 0 for tag in taglist]
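
    As a quick check, here is a minimal runnable sketch of the same one-liner. The sample values of taglist and tags are made up for illustration and are not from the question. It also shows an equivalent spelling using int(), which works because a membership test returns a bool, and a tuple variant, since the question accepts either a list or a tuple.

```python
# Hypothetical sample data (assumed for illustration only).
taglist = ["python", "numpy", "pandas", "scipy"]
tags = {"pandas", "python"}

# Conditional-expression form, as in the answer above.
multihot = [1 if tag in tags else 0 for tag in taglist]

# Equivalent form: "tag in tags" is a bool, and int() maps True/False to 1/0.
multihot_int = [int(tag in tags) for tag in taglist]

# Same idea with a generator expression when a tuple is wanted.
multihot_tuple = tuple(int(tag in tags) for tag in taglist)

print(multihot)        # [1, 0, 1, 0]
print(multihot_tuple)  # (1, 0, 1, 0)
```

    Because tags is a set, each membership test takes constant time on average, so the whole comprehension is linear in the length of taglist. Tags in the set that do not appear in taglist are silently ignored.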