Tags: python, dictionary, python-3.x, inverse

Generating an inverse index


I have the following:

strlist = ['the', 'the', 'boy', 'happy', 'boy', 'happy']
{x:{(list(enumerate(strlist))[y])[0]} for y in range(len(strlist)) for x in (strlist)}

My output is the following:

{'boy': set([5]), 'the': set([5]), 'happy': set([5])}

What I'd like to get instead (using Python 3.x) is this:

{'boy': {2,4}, 'the': {0,1}, 'happy': {3,5} }

Any help would be great!

Thanks


Solution

  • >>> strlist = ['the', 'the', 'boy', 'happy', 'boy', 'happy']
    >>> from collections import defaultdict
    >>> D = defaultdict(set)
    >>> for i, s in enumerate(strlist):
    ...     D[s].add(i)
    ... 
    >>> D
    defaultdict(<class 'set'>, {'boy': {2, 4}, 'the': {0, 1}, 'happy': {3, 5}})
    

    If you can't use defaultdict for some reason, dict.setdefault does the same job:

    >>> D = {}
    >>> for i, s in enumerate(strlist):
    ...     D.setdefault(s, set()).add(i)
    ... 
    >>> D
    {'boy': {2, 4}, 'the': {0, 1}, 'happy': {3, 5}}
    

    Here is the silly (inefficient) way to write it as a comprehension; it rescans the whole list once per distinct word:

    >>> {k: {i for i, j in enumerate(strlist) if j == k} for k in set(strlist)}
    {'boy': {2, 4}, 'the': {0, 1}, 'happy': {3, 5}}
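
For reuse outside the REPL, the defaultdict approach can be wrapped in a small function. This is only a sketch: the name `build_inverted_index` and the choice to return a plain dict are my own assumptions, not part of the original answer. The second half reproduces a simplified equivalent of the comprehension from the question, to show why every key ended up mapped to the last index.

```python
from collections import defaultdict


def build_inverted_index(words):
    """Map each word to the set of positions where it occurs.

    Hypothetical helper name; same single-pass logic as the defaultdict loop.
    """
    index = defaultdict(set)
    for position, word in enumerate(words):
        index[word].add(position)
    # Convert to a plain dict so that looking up a missing word raises
    # KeyError instead of silently creating an empty set.
    return dict(index)


strlist = ['the', 'the', 'boy', 'happy', 'boy', 'happy']

# Simplified equivalent of the comprehension in the question:
# (list(enumerate(strlist))[y])[0] is just y. For each y, the inner loop
# reassigns every key to {y}, so only the last index survives.
overwritten = {x: {y} for y in range(len(strlist)) for x in strlist}

index = build_inverted_index(strlist)
print(overwritten)
print(index)
```

Here `overwritten` maps every word to `{5}`, matching the output in the question, while `index` holds the desired positions. The function makes one pass over the list, so it scales with the length of the input rather than with length times the number of distinct words.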