Python - save x values from a column to list when a number in another column is bigger than the number before


(I'm new to this so I might not know the best way to formulate this question but here goes)

I'm using pandas btw!

I am looking for a way to save x values (say x=3, so three values) from one column to a list each time the number in another column is bigger than the number before it. The first number should not be considered. Note that the number column also contains missing data (NaN).

NumberColumn   Value
23             4
23             5
NaN            8
24             6    # <--
24             23   # <--
24             26   # <--
24             11
25             2    # <--
25             1    # <--
25             3    # <--
25             5
NaN            9
26             4    # <--
26             6    # <--
NaN            9    # <--
26             12

Hopefully in the end the new list contains:

List1 = [6,23,26,2,1,3,4,6,9]

The selections might also overlap, like this:

NumberColumn       Value
    27             4
    28             5    # <--
    NaN            8    # <--
    29             6    # <--   <-- overlap!
    29             23         # <--
    29             26         # <--

Hopefully here the list will contain:

List1 = [5, 8, 6, 6, 23, 26]

Thanks for helping me out!


Solution

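  • Forward-fill the missing numbers so that a NaN row inherits the last known number, then compare each row with the one before it. (Filling NaN with 0 instead would treat the row after every NaN as an increase, so the final 26 in the first example would wrongly add 12 to the list.)

    list1 = []
    x = 3

    # Carry the last known number across NaN rows so a NaN never counts as a jump
    numbers = df['NumberColumn'].ffill()
    for i in range(1, len(df)):  # start at 1: the first number is not considered
        if numbers.iloc[i] > numbers.iloc[i - 1]:
            # Take up to x values starting at row i (cut short at the end of the frame)
            list1.extend(df['Value'].iloc[i:i + x].tolist())
    print(list1)

    As a function:

    def createList(df, x):
        list1 = []
        # Work on a forward-filled copy of the column; df itself is left unchanged
        numbers = df['NumberColumn'].ffill()
        for i in range(1, len(df)):
            if numbers.iloc[i] > numbers.iloc[i - 1]:
                list1.extend(df['Value'].iloc[i:i + x].tolist())
        return list1

    list1 = createList(df, x=3)
    print(list1)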
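
A loop-free way to find the start rows is to forward-fill NumberColumn and look at where diff() is positive. The following is only a sketch to check the idea against both tables from the question: the function name collect_after_increase and the frame names df1 and df2 are made up for illustration, and it assumes that a NaN row should inherit the last known number, which is what the expected lists in the question imply.

```python
import numpy as np
import pandas as pd


def collect_after_increase(df, x):
    """Return x 'Value' entries starting at every row where 'NumberColumn' increases.

    Hypothetical helper for illustration; assumes NaN rows inherit the last known number.
    """
    # Carry the last known number across NaN rows so a NaN is never seen as a jump.
    numbers = df["NumberColumn"].ffill()
    # diff() is NaN for the first row, and NaN > 0 is False, so row 0 is never a start.
    starts = np.flatnonzero((numbers.diff() > 0).to_numpy())
    values = df["Value"].tolist()
    result = []
    for start in starts:
        # List slicing stops at the end of the data; overlapping windows repeat values.
        result.extend(values[start:start + x])
    return result


# First sample table from the question.
df1 = pd.DataFrame({
    "NumberColumn": [23, 23, np.nan, 24, 24, 24, 24, 25,
                     25, 25, 25, np.nan, 26, 26, np.nan, 26],
    "Value": [4, 5, 8, 6, 23, 26, 11, 2, 1, 3, 5, 9, 4, 6, 9, 12],
})

# Second sample table, where the windows overlap.
df2 = pd.DataFrame({
    "NumberColumn": [27, 28, np.nan, 29, 29, 29],
    "Value": [4, 5, 8, 6, 23, 26],
})

print(collect_after_increase(df1, x=3))  # [6, 23, 26, 2, 1, 3, 4, 6, 9]
print(collect_after_increase(df2, x=3))  # [5, 8, 6, 6, 23, 26]
```

Because the start rows come from positional indices, this version does not depend on df having the default 0..n-1 index.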