
Pythonic way of removing similar items from a list


I have a list of items in which I want to reduce every run of consecutive equal values to just its first and last item. For example:

listIn = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
  1. First three elements "1, 1, 1" are similar, so remove the middle "1".
  2. Next two zeros are unmodified.
  3. The single one is left unmodified.
  4. Four zeros. Remove the items in between the first and the last.

Resulting in:

listOut = [1, 1, 0, 0, 1, 0, 0, 1]

The way of doing this in C++ is obvious, but it looks very different from typical Python coding style. Is there a more Pythonic way, or is that the only one?

Basically, I am just removing excessive points on a graph where the "y" value does not change.


Solution

  • Use itertools.groupby() to group consecutive equal values, then keep at most two items from each group:

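    from itertools import groupby
    
    listOut = []
    for value, group in groupby(listIn):
        # Each group is a run of consecutive equal values; keep its first item.
        listOut.append(next(group))
        # Keep one more item if the run has at least two. All items in a run
        # are equal, so this is as good as keeping the last one.
        for i in group:
            listOut.append(i)
            break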
    

    or, to produce the items lazily instead of building the whole list up front, as a generator:

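    from itertools import groupby
    
    def reduced(it):
        for value, group in groupby(it):
            # First item of the run; every group has at least one.
            yield next(group)
            # Second item of the run, if there is one; the rest is skipped.
            for i in group:
                yield i
                break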
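
The "first item plus at most one more" logic can also be written more compactly with itertools.islice(). This is only a sketch of an equivalent formulation, and the name `reduced_compact` is made up for illustration:

```python
from itertools import chain, groupby, islice


def reduced_compact(it):
    # groupby() splits the input into runs of consecutive equal values;
    # islice(group, 2) takes at most the first two items of each run,
    # and chain.from_iterable() lazily joins those short slices together.
    return chain.from_iterable(islice(group, 2) for _, group in groupby(it))
```

Like `reduced`, it returns a lazy iterator, so wrap it in list() if you need a list.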
    

    Demo:

    >>> listIn = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
    >>> list(reduced(listIn))
    [1, 1, 0, 0, 1, 0, 0, 1]
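
For the graph use case mentioned in the question, the items are presumably (x, y) points rather than bare values, so the first two points of a run are no longer interchangeable with the first and the last. The following is a minimal sketch under that assumption; the helper name `keep_run_endpoints`, its `key` parameter, and the sample points are all hypothetical and not part of the original answer:

```python
from collections import deque
from itertools import groupby
from operator import itemgetter


def keep_run_endpoints(items, key=None):
    """Yield the first and last item of each run of consecutive items
    that are equal under `key` (hypothetical helper for illustration)."""
    for _, group in groupby(items, key=key):
        # Every group contains at least one item.
        yield next(group)
        # A deque with maxlen=1 consumes the rest of the run and retains
        # only its final item; it is empty if the run had a single item.
        yield from deque(group, maxlen=1)


# Assumed sample data: (x, y) points where y stays flat for a while.
points = [(0, 1), (1, 1), (2, 1), (3, 0), (4, 0), (5, 1)]
thinned = list(keep_run_endpoints(points, key=itemgetter(1)))
print(thinned)  # [(0, 1), (2, 1), (3, 0), (4, 0), (5, 1)]
```

With no `key`, it gives the same result as `reduced` on the list from the question.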