
Pandas/Python: Creating a new DataFrame as a result of pairwise row comparison


I have a sample DataFrame.

import pandas as pd

df = pd.DataFrame({'time':['12:00','12:01','12:02','12:03','12:04','12:05','12:06','12:07'], 'begin':[6880,6930,6920,7095,7025,7300,7130,7110],
                  'up':[7034,6995,7105,7105,7415,7420,7230,7195],'down':[6880,6845,6869,6885,6894,7090,7045,6990],'end':[6930,6920,7095,7025,7300,7130,7110,7055]})
df = df.set_index('time')

        begin   up      down    end
time                
12:00   6880    7034    6880    6930
12:01   6930    6995    6845    6920
12:02   6920    7105    6869    7095
12:03   7095    7105    6885    7025
12:04   7025    7415    6894    7300
12:05   7300    7420    7090    7130
12:06   7130    7230    7045    7110
12:07   7110    7195    6990    7055

Algorithm, applied to each consecutive pair of rows (first and second, third and fourth, and so on):

  1. Index column time: take the value from the first row of the pair, e.g. 12:00

  2. Column begin: take 'begin' from the first row of the pair, e.g. new_begin = 6880

  3. Column up: take the larger of the two values, i.e. new_up = up_row1 if up_row1 > up_row2 else up_row2

  4. Column down: take the smaller of the two values, i.e. new_down = down_row1 if down_row1 < down_row2 else down_row2

  5. Column end: take 'end' from the second row of the pair, e.g. new_end = 6920

So the result must look exactly like this:

        begin   up      down    end
time                
12:00   6880    7034    6845    6920
12:02   6920    7105    6869    7025
12:04   7025    7420    6894    7130
12:06   7130    7230    6990    7055

Thanks in advance for your help!


Solution

  • You can group the DataFrame with a custom pairwise grouper, np.arange(len(df)) // 2, which gives each consecutive pair of rows the same key, then aggregate each group with the dictionary dct:

    import numpy as np

    # One aggregation rule per column, matching the steps in the question
    dct = {'time': 'first', 'begin': 'first',
           'up': 'max', 'down': 'min', 'end': 'last'}
    df = df.reset_index().groupby(np.arange(len(df)) // 2).agg(dct).set_index('time')
    

           begin    up  down   end
    time                          
    12:00   6880  7034  6845  6920
    12:02   6920  7105  6869  7025
    12:04   7025  7420  6894  7130
    12:06   7130  7230  6990  7055
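
For anyone who wants to run this end to end, here is a self-contained sketch of the same approach using the sample data from the question. The names `pair_id`, `rules` and `result` are only illustrative and do not appear in the original answer. It makes the grouper explicit: integer-dividing the row positions by 2 gives rows 0 and 1 the key 0, rows 2 and 3 the key 1, and so on.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'time':  ['12:00', '12:01', '12:02', '12:03', '12:04', '12:05', '12:06', '12:07'],
    'begin': [6880, 6930, 6920, 7095, 7025, 7300, 7130, 7110],
    'up':    [7034, 6995, 7105, 7105, 7415, 7420, 7230, 7195],
    'down':  [6880, 6845, 6869, 6885, 6894, 7090, 7045, 6990],
    'end':   [6930, 6920, 7095, 7025, 7300, 7130, 7110, 7055],
}).set_index('time')

# Pairwise grouper (illustrative name): one key per consecutive pair of rows.
# For 8 rows this is [0, 0, 1, 1, 2, 2, 3, 3].
pair_id = np.arange(len(df)) // 2

# Aggregation rule for each column (illustrative name)
rules = {'time': 'first', 'begin': 'first',
         'up': 'max', 'down': 'min', 'end': 'last'}

# reset_index() turns 'time' into a regular column so it can be aggregated too
result = df.reset_index().groupby(pair_id).agg(rules).set_index('time')
print(result)
```

If the DataFrame has an odd number of rows, the last group contains a single row, so its 'first'/'last' and 'max'/'min' values all come from that one row.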