Search code examples
pythongroup-bymaxmin

Python How to locate a value in a different column based off two other columns


I have a data set that has 4 columns in it. One with random numbers, one with 1s and 0s, another that breaks it into segments and one that sees if the value is different than the one before it.

Ex Data Set:

Column1   Column2  Column3  Column4
  10         1       1        1
  10         1       1        0
  14         1       1        1
  14         1       1        0
  16         0       0        1
  17         1       2        1
  17         1       2        0
  19         1       2        1
  20         0       0        1
  24         0       0        1
  25         1       3        1
  25         1       3        0
  25         1       3        0
  29         1       3        1
  30         1       3        1
  30         1       3        0

I am trying to get the first and last value of column1 inside of each segment based off of the criteria of column 4. So inside of the segment, the first-time column 4 has a 1, I want that value returned and the last time column 4 has a 1, I want that value returned as well.

What I want returned:

Column3   Column45(first)  Column6 (last)
  1          10              14
  2          17              19
  3          25              30

What I have tried (does not work):

firstlast= (df.assign(fl=df['Column1'].abs()).groupby([['Column3' , 'Column4']]).size()['Column1'].agg([('min' , 'min'), ('max', 'max')]).add_prefix('Column1')) 

print (firstlast)

Still trying to get use to Python so I am open to new ways of trying to achieve this!


Solution

  • Try this:

    df_filtered = df[df['Column4'] == 1]
    firstlast = df_filtered.groupby('Column3')['Column1'].agg([('Column5', 'first'), ('Column6', 'last')])