I have the following dataframe:
import pandas as pd
data = [['Construction', '', '01/02/2022', '01/06/2022', '1', 'No'], ['Level Site', 'Construction', '01/02/2022', '01/02/2022', '2', 'No'], ['Foundation', '', '01/03/2023', '01/06/2023', '1', 'Yes'],['Lay Foundation', 'Construction>Foundation', '01/03/2022', '01/04/2022', '3', 'No'], ['Prepare land for foundation', 'Construction>Foundation', '01/05/2022', '01/06/2022', '3', 'No'],['Building Envelope', '', '01/07/2023', '01/16/2023', '1', 'No'], ['Install Footings', 'Building Envelope', '01/07/2022', '01/07/2022', '2', 'Yes'], ['Pouring', '', '01/08/202', '01/09/2023', '1', 'No'],['Pour Foundation', 'Building Envelope>Pouring', '01/08/2022', '01/09/2022', '3', 'No'], ['Installation', '', '01/09/2022', '01/14/2022', '1', 'No']]
df1 = pd.DataFrame(data, columns=['Activity', 'Parent', 'Start', 'Finish', 'WBS Level', 'Match'])
df1
#Ideal dataframe output
data = [['Construction', '', '01/02/2022', '01/06/2022', '1', 'No'],['Foundation', '', '01/03/2023', '01/06/2023', '1', 'Yes'], ['Level Site', 'Construction', '01/02/2022', '01/02/2022', '2', 'No'], ['Lay Foundation', 'Construction>Foundation', '01/03/2022', '01/04/2022', '3', 'No'], ['Prepare land for foundation', 'Construction>Foundation', '01/05/2022', '01/06/2022', '3', 'No'],['Install Footings', 'Building Envelope', '01/07/2022', '01/07/2022', '2', 'Yes'],['Building Envelope', '', '01/07/2023', '01/16/2023', '1', 'No'], ['Pouring', '', '01/08/202', '01/09/2023', '1', 'No'],['Pour Foundation', 'Building Envelope>Pouring', '01/08/2022', '01/09/2022', '3', 'No'], ['Installation', '', '01/09/2022', '01/14/2022', '1', 'No']]
df2 = pd.DataFrame(data, columns=['Activity', 'Parent', 'Start', 'Finish', 'WBS Level', 'Match'])
df2
I am prepping this data for use in a scheduling software application, and need to re-order the rows based on certain conditions. I created the 'match' column for this purpose (I have already created my condition, any row that is 'yes' has fulfilled the condition).
For any row that has a 'yes' value in the 'match' column I want to shift up one row. I have tried the variations of the .shift method but I'm having trouble getting it right. I don't want to delete or override any rows, I just need to shift any 'yes' rows up by 1.
Thank you for your help
Here's a solution that swaps the rows required using indices rather than using .shift()
(since it's not clear to me how you'd do this within a groupby()
). Might not scale hugely well but should do the job on smaller datasets.
df1 = df1.reset_index(drop=True) # ensure index is unique
# Loop through only the indices of rows to be shifted, to avoid looping through every row
shift_indices = df1[df1['Match'] == 'Yes'].index
for shift_idx in shift_indices:
# No need to shift if at the top
if shift_idx == 0:
continue
above_idx = shift_idx - 1
above_row = df1.loc[above_idx].copy() # copy as otherwise this row will change during the shift
# If the row above is also a match, then no need to swap it
if above_row['Match'] != 'Yes':
shift_row = df1.loc[shift_idx]
df1.loc[above_idx] = shift_row
df1.loc[shift_idx] = above_row