Search code examples
pythonpandasconditional-statementsshift

Shift Rows in Pandas Dataframe by Condition?


I have the following dataframe:

import pandas as pd

data = [['Construction', '', '01/02/2022', '01/06/2022', '1', 'No'], ['Level Site', 'Construction', '01/02/2022', '01/02/2022', '2', 'No'], ['Foundation', '', '01/03/2023', '01/06/2023', '1', 'Yes'],['Lay Foundation', 'Construction>Foundation', '01/03/2022', '01/04/2022', '3', 'No'], ['Prepare land for foundation', 'Construction>Foundation', '01/05/2022', '01/06/2022', '3', 'No'],['Building Envelope', '', '01/07/2023', '01/16/2023', '1', 'No'], ['Install Footings', 'Building Envelope', '01/07/2022', '01/07/2022', '2', 'Yes'], ['Pouring', '', '01/08/202', '01/09/2023', '1', 'No'],['Pour Foundation', 'Building Envelope>Pouring', '01/08/2022', '01/09/2022', '3', 'No'], ['Installation', '', '01/09/2022', '01/14/2022', '1', 'No']]
df1 = pd.DataFrame(data, columns=['Activity', 'Parent', 'Start', 'Finish', 'WBS Level', 'Match'])

df1

#Ideal dataframe output

data = [['Construction', '', '01/02/2022', '01/06/2022', '1', 'No'],['Foundation', '', '01/03/2023', '01/06/2023', '1', 'Yes'], ['Level Site', 'Construction', '01/02/2022', '01/02/2022', '2', 'No'], ['Lay Foundation', 'Construction>Foundation', '01/03/2022', '01/04/2022', '3', 'No'], ['Prepare land for foundation', 'Construction>Foundation', '01/05/2022', '01/06/2022', '3', 'No'],['Install Footings', 'Building Envelope', '01/07/2022', '01/07/2022', '2', 'Yes'],['Building Envelope', '', '01/07/2023', '01/16/2023', '1', 'No'], ['Pouring', '', '01/08/202', '01/09/2023', '1', 'No'],['Pour Foundation', 'Building Envelope>Pouring', '01/08/2022', '01/09/2022', '3', 'No'], ['Installation', '', '01/09/2022', '01/14/2022', '1', 'No']]
df2 = pd.DataFrame(data, columns=['Activity', 'Parent', 'Start', 'Finish', 'WBS Level', 'Match'])

df2

I am prepping this data for use in a scheduling software application, and need to re-order the rows based on certain conditions. I created the 'match' column for this purpose (I have already created my condition, any row that is 'yes' has fulfilled the condition).

For any row that has a 'yes' value in the 'match' column I want to shift up one row. I have tried the variations of the .shift method but I'm having trouble getting it right. I don't want to delete or override any rows, I just need to shift any 'yes' rows up by 1.

Thank you for your help


Solution

  • Here's a solution that swaps the rows required using indices rather than using .shift() (since it's not clear to me how you'd do this within a groupby()). Might not scale hugely well but should do the job on smaller datasets.

    df1 = df1.reset_index(drop=True)  # ensure index is unique
    
    # Loop through only the indices of rows to be shifted, to avoid looping through every row
    shift_indices = df1[df1['Match'] == 'Yes'].index
    for shift_idx in shift_indices:
        # No need to shift if at the top
        if shift_idx == 0:
            continue
        above_idx = shift_idx - 1
        above_row = df1.loc[above_idx].copy()  # copy as otherwise this row will change during the shift
        # If the row above is also a match, then no need to swap it
        if above_row['Match'] != 'Yes':
            shift_row = df1.loc[shift_idx]
            df1.loc[above_idx] = shift_row
            df1.loc[shift_idx] = above_row