Search code examples
pythonnumpypython-itertools

Split a list into increasing sequences using itertools


I have a list with mixed sequences like

[1,2,3,4,5,2,3,4,1,2]

I want to know how I can use itertools to split the list into increasing sequences cutting the list at decreasing points. For instance the above would output

[[1, 2, 3, 4, 5], [2, 3, 4], [1, 2]]

this has been obtained by noting that the sequence decreases at 2 so we cut the first bit there and another decrease is at one cutting again there.

Another example is with the sequence

[3,2,1]

the output should be

[[3], [2], [1]]

In the event that the given sequence is increasing we return the same sequence. For example

[1,2,3]

returns the same result. i.e

[[1, 2, 3]]

For a repeating list like

[ 1, 2,2,2, 1, 2, 3, 3, 1,1,1, 2, 3, 4, 1, 2, 3, 4, 5, 6]

the output should be

[[1, 2, 2, 2], [1, 2, 3, 3], [1, 1, 1, 2, 3, 4], [1, 2, 3, 4, 5, 6]]

What I did to achieve this is define the following function

def splitter (L):
    result = []
    tmp = 0
    initialPoint=0
    for i in range(len(L)):
       if (L[i] < tmp):
           tmpp = L[initialPoint:i]
           result.append(tmpp)
           initialPoint=i
       tmp = L[i]
    result.append(L[initialPoint:])
    return result

The function is working 100% but what I need is to do the same with itertools so that I can improve efficiency of my code. Is there a way to do this with itertools package to avoid the explicit looping?


Solution

  • With numpy, you can use numpy.split, this requires the index as split positions; since you want to split where the value decreases, you can use numpy.diff to calculate the difference and check where the difference is smaller than zero and use numpy.where to retrieve corresponding indices, an example with the last case in the question:

    import numpy as np
    lst = [ 1, 2,2,2, 1, 2, 3, 3, 1,1,1, 2, 3, 4, 1, 2, 3, 4, 5, 6]
    np.split(lst, np.where(np.diff(lst) < 0)[0] + 1)
    
    # [array([1, 2, 2, 2]),
    #  array([1, 2, 3, 3]),
    #  array([1, 1, 1, 2, 3, 4]),
    #  array([1, 2, 3, 4, 5, 6])]