python pandas replace conditional-statements records

Pandas: replace records based on conditional test in column

I have unique records in a dataframe, with no duplicates, as determined by combination of values across several columns:

import pandas as pd
d = {'Alpha' : ['C', 'B', 'C','D', 'A', 'A'], 'Beta' : ['G', 'F', 'G', 'H', 'A', 'A'],'Year': ['Base', 88, 94, 22, 'Base', 66], 'Zulu' : [1, 2, -3, 4, 5, -3]}
df = pd.DataFrame(d)

This gets us:

  Alpha Beta  Year  Zulu
0     C    G  Base     1
1     B    F    88     2
2     C    G    94    -3
3     D    H    22     4
4     A    A  Base     5
5     A    A    66    -4

Index 0 is a near match with index 2, & index 4 is a near match with index 5, except that the 'Year' value for index 0 & index 4 is Base, and 'Zulu' value for index 2 & 5 are negative. I want to replace the negative 'Zulu' values in 2 & 5 with their corresponding Zulu values in index 0 & 4, without disrupting other rows. The output df would look like this:

  Alpha Beta  Year  Zulu
0     C    G  Base     1
1     B    F    88     2
2     C    G    94     1
3     D    H    22     4
4     A    A  Base     5
5     A    A    66     5

I can easily start with a filter & deep copy to avoid slicing issues, & then a backup of original data:

df_sub = df[df.Zulu < 0].copy(deep=True)
df_sub['Zulu_backup'] = df_sub.Zulu

I can then use df_sub to merge with the original:

df_result = pd.merge(df_sub, df, how='right')

Resulting in:

Alpha Beta  Year    Zulu     Zulu_backup
0     C    G    94    -3         -3.0
1     A    A    66    -3         -3.0
2     C    G  Base     1          NaN
3     B    F    88     2          NaN
4     D    H    22     4          NaN
5     A    A  Base     5          NaN

But I don't know where to go from here without messing up the dataframe's other rows. I have a rather convoluted script. I am guessing I could create a series and map it to the dataframe, but I'm at a loss as to how to do so. Iterating through this is not an option as I have millions of rows. Would appreciate any assistance.

Solution

One way would be to convert the negative values in Zulu to nan and then fillna

df.loc[df['Zulu'] < 0, 'Zulu'] = np.nan
df['Zulu'] = df.groupby(['Alpha', 'Beta']).Zulu.apply(lambda x: x.ffill().bfill()).astype(int)


    Alpha   Beta    Year    Zulu
0   C       G       Base    1
1   B       F       88      2
2   C       G       94      1
3   D       H       22      4
4   A       A       Base    5
5   A       A       66      5