Search code examples
pythonlisttuplespython-itertoolsiterable

Finding groups of increasing numbers in a list


The aim is to find groups of increasing/monotonic numbers given a list of integers. Each item in the resulting group must be of a +1 increment from the previous item

Given an input:

x = [7, 8, 9, 10, 6, 0, 1, 2, 3, 4, 5]

I need to find groups of increasing numbers and achieve:

increasing_numbers = [(7,8,9,10), (0,1,2,3,4,5)]

And eventually also the number of increasing numbers:

len(list(chain(*increasing_numbers)))

And also the len of the groups:

increasing_num_groups_length = [len(i) for i in increasing_numbers]

I have tried the following to get the number of increasing numbers:

>>> from itertools import tee, chain
>>> def pairwise(iterable): 
...     a, b = tee(iterable)
...     next(b, None)
...     return zip(a, b)
... 
>>> x = [8, 9, 10, 11, 7, 1, 2, 3, 4, 5, 6]
>>> set(list(chain(*[(i,j) for i,j in pairwise(x) if j-1==i])))
set([1, 2, 3, 4, 5, 6, 8, 9, 10, 11])
>>> len(set(list(chain(*[(i,j) for i,j in pairwise(x) if j-1==i]))))
10

But I'm unable to keep the order and the groups of increasing numbers.

How can I achieve the increasing_numbers groups of integer tuples and also the increasing_num_groups_length?

Also, is there a name for such/similar problem?


EDITED

I've came up with this solution but it's super verbose and I'm sure there's an easier way to achieve the increasing_numbers output:

>>> from itertools import tee, chain
>>> def pairwise(iterable): 
...     a, b = tee(iterable)
...     next(b, None)
...     return zip(a, b)
... 
>>> x = [8, 9, 10, 11, 7, 1, 2, 3, 4, 5, 6]
>>> boundary =  iter([0] + [i+1 for i, (j,k) in enumerate(pairwise(x)) if j+1!=k] + [len(x)])
>>> [tuple(x[i:next(boundary)]) for i in boundary]
[(8, 9, 10, 11), (1, 2, 3, 4, 5, 6)]

Is there a more pythonic / less verbose way to do this?


Another input/output example:

[in]:

[17, 17, 19, 20, 21, 22, 0, 1, 2, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 14, 14, 28, 29, 30, 31, 32, 33, 34, 35, 36, 40]

[out]:

[(19, 20, 21, 22), (0, 1, 2), (4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14), (28, 29, 30, 31, 32, 33, 34, 35, 36)]


Solution

  • A couple of different ways using itertools and numpy:

    from itertools import groupby, tee, cycle
    
    x = [17, 17, 19, 20, 21, 22, 0, 1, 2, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 14, 14, 28, 29, 30, 31, 32, 33, 34, 35,
         36, 1, 2, 3, 4,34,54]
    
    
    def sequences(l):
        x2 = cycle(l)
        next(x2)
        grps = groupby(l, key=lambda j: j + 1 == next(x2))
        for k, v in grps:
            if k:
                yield tuple(v) + (next((next(grps)[1])),)
    
    
    print(list(sequences(x)))
    
    [(19, 20, 21, 22), (0, 1, 2), (4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14), (28, 29, 30, 31, 32, 33, 34, 35, 36), (1, 2, 3, 4)]
    

    Or using python3 and yield from:

    def sequences(l):
        x2 = cycle(l)
        next(x2)
        grps = groupby(l, key=lambda j: j + 1 == next(x2))
        yield from (tuple(v) + (next((next(grps)[1])),) for k,v in grps if k)
    
    print(list(sequences(x)))
    

    Using a variation of my answer here with numpy.split :

    out = [tuple(arr) for arr in np.split(x, np.where(np.diff(x) != 1)[0] + 1) if arr.size > 1]
    
    print(out)
    
    [(19, 20, 21, 22), (0, 1, 2), (4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14), (28, 29, 30, 31, 32, 33, 34, 35, 36), (1, 2, 3, 4)]
    

    And similar to ekhumoro's answer:

    def sequences(x):
        it = iter(x)
        prev, temp = next(it), []
        while prev is not None:
            start = next(it, None)
            if prev + 1 == start:
                temp.append(prev)
            elif temp:
                yield tuple(temp + [prev])
                temp = []
            prev = start
    

    To get the length and the tuple:

    def sequences(l):
        x2 = cycle(l)
        next(x2)
        grps = groupby(l, key=lambda j: j + 1 == next(x2))
        for k, v in grps:
            if k:
                t = tuple(v) + (next(next(grps)[1]),)
                yield t, len(t)
    
    
    def sequences(l):
        x2 = cycle(l)
        next(x2)
        grps = groupby(l, lambda j: j + 1 == next(x2))
        yield from ((t, len(t)) for t in (tuple(v) + (next(next(grps)[1]),)
                                          for k, v in grps if k))
    
    
    
    def sequences(x):
            it = iter(x)
            prev, temp = next(it), []
            while prev is not None:
                start = next(it, None)
                if prev + 1 == start:
                    temp.append(prev)
                elif temp:
                    yield tuple(temp + [prev]), len(temp) + 1
                    temp = []
                prev = start
    

    Output will be the same for all three:

    [((19, 20, 21, 22), 4), ((0, 1, 2), 3), ((4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14), 11)
    , ((28, 29, 30, 31, 32, 33, 34, 35, 36), 9), ((1, 2, 3, 4), 4)]