python dictionary indexing enumerate

How do I keep the index of duplicate elements unchanged?


Here is an input list:

['a', 'b', 'b', 'c', 'c', 'd']

The output I expect should be:

[[0, 'a'], [1, 'b'], [1, 'b'], [2, 'c'], [2, 'c'], [3, 'd']]

I tried to use map():

>>> list(map(lambda pair: [pair[0], pair[1]], enumerate(['a', 'b', 'b', 'c', 'c', 'd'])))
[[0, 'a'], [1, 'b'], [2, 'b'], [3, 'c'], [4, 'c'], [5, 'd']]

How can I get the expected result?

EDIT: This is not a sorted list; the index should increase only when a new element is encountered.


Solution

  • It sounds like you want to rank the terms based on a lexicographical ordering.

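    input = ['a', 'b', 'b', 'c', 'c', 'd']  # note: this name shadows the built-in input()
    # Map each distinct value to its rank in sorted order.
    mapping = {v: i for i, v in enumerate(sorted(set(input)))}
    # Pair every element with the rank of its value.
    [[mapping[v], v] for v in input]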
    

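    Note that this works for unsorted inputs as well, because each number is the value's rank among the sorted distinct values, not its position in the list.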

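    If, as your amendment suggests, you want to number items based on order of first appearance, a different approach is in order. The following is short and sweet, albeit offensively hacky: it binds one fresh dict d via "for d in [{}]", and d.setdefault(v, len(d)) gives each new value the next unused number while returning the stored number for repeats.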

    [[d.setdefault(v, len(d)), v] for d in [{}] for v in input]
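
For comparison, here is a more explicit sketch of both numbering schemes described above. The function names number_by_first_appearance and number_by_sorted_rank and the sample list words are made up for illustration; they are not part of the original answer.

```python
def number_by_first_appearance(items):
    """Pair each item with a number assigned in order of first appearance."""
    seen = {}  # item -> number assigned when the item was first seen
    result = []
    for item in items:
        if item not in seen:
            seen[item] = len(seen)  # next unused number
        result.append([seen[item], item])
    return result


def number_by_sorted_rank(items):
    """Pair each item with its rank among the sorted distinct items."""
    rank = {value: i for i, value in enumerate(sorted(set(items)))}
    return [[rank[item], item] for item in items]


# Hypothetical unsorted input, chosen so the two schemes disagree.
words = ['c', 'a', 'c', 'b']
print(number_by_first_appearance(words))  # [[0, 'c'], [1, 'a'], [0, 'c'], [2, 'b']]
print(number_by_sorted_rank(words))       # [[2, 'c'], [0, 'a'], [2, 'c'], [1, 'b']]
```

On the list from the question the two functions give the same result, since that list happens to be sorted already; they differ only when values first appear out of sorted order.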