Python 2.7: Dedup list by adding suffix


I'm not sure I'm thinking about this problem correctly. I'd like to write a function which takes a list with duplicates and appends an incrementing suffix to each duplicate to "dedup" the list.

For example:

dup_list = ['apple','banana','cherry','banana','cherry','orange','cherry']

Aiming to return:

deduped = ['apple','banana1','cherry1','banana2','cherry2','orange','cherry3']

My instinct was to use the pop function while iterating over the list with a while statement, like so:

def dedup_suffix(an_list):
    dedup=[]
    for each in an_list:
        an_list.pop(an_list.index(each)) #pop it out
        i=1 #iterator
        while each in an_list:
            an_list.pop(an_list.index(each))
            i+=1
            appendage=str(each)+"_"+str(i)
        else:
            appendage=str(each)
        dedup.append(appendage)
    return dedup

But:

>>> dedup_suffix(dup_list)

['apple', 'cherry', 'orange']

Appreciate any pointers.


Solution

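  • You can use a Counter to keep track of how many times each name has been seen so far. I'm assuming your example is correct with respect to apple, i.e. that you don't want to add a zero to the first occurrence. For that you need a bit of logic. Note that this leaves the first occurrence of every name bare, so repeated names come out as banana, banana1 rather than banana1, banana2 as in your example: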

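    from collections import Counter

    dup_list = ['apple','banana','cherry','banana','cherry','orange','cherry']

    counter = Counter()  # occurrences seen so far for each name
    deduped = []
    for name in dup_list:
        # The first occurrence is kept as-is; later ones get the running count as a suffix.
        new = name + str(counter[name]) if counter[name] else name
        counter[name] += 1
        deduped.append(new)

    # deduped == ['apple', 'banana', 'cherry', 'banana1', 'cherry1', 'orange', 'cherry2']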
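
If you want exactly the output from the question, where every occurrence of a repeated name is numbered from 1 and unique names stay bare, one possible variation is to count the totals up front. This is only a sketch: the name totals is introduced here for illustration, and the function does not modify the list passed in.

```python
from collections import Counter

def dedup_suffix(names):
    totals = Counter(names)  # total occurrences of each name in the whole list
    seen = Counter()         # occurrences encountered so far while walking the list
    deduped = []
    for name in names:
        if totals[name] > 1:
            # Repeated name: number every occurrence, starting at 1.
            seen[name] += 1
            deduped.append(name + str(seen[name]))
        else:
            # Unique name: leave it untouched.
            deduped.append(name)
    return deduped

dup_list = ['apple','banana','cherry','banana','cherry','orange','cherry']
result = dedup_suffix(dup_list)
```

One caveat: if the input already contains a string such as 'banana1', a generated name can still collide with it, so this is not a general uniqueness guarantee.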