I am working to try to group my data by a Department property and then once I have it grouped fill in the space above and below two fields that are filled in at the middle of the dataset (Rating and Number).
I have tried getting the groupby to work but to no avail. My plan is to get the groupby working then apply the following code to see if I can get the fill to work correctly.
# This won't work on its own because I need to group the data first.
df = df.mask(df == 0).ffill()
This is what I am starting with:
| Department | Range | Rating | Number | |--------------|----------|--------------|--------| | Admin | 0 (None) | | | | Admin | 01 to 3 | | | | Admin | 01 to 3 | | | | Admin | 01 to 3 | | | | Admin | 04 to 6 | 2. On Target | 2 | | Admin | 04 to 6 | 2. On Target | 2 | | Admin | 04 to 6 | 2. On Target | 2 | | Admin | 07 to 10 | | | | Admin | 07 to 10 | | | | Admin | 07 to 10 | | | | Admin | 07 to 10 | | | | Distribution | 0 (None) | | | | Distribution | 01 to 3 | | | | Distribution | 01 to 3 | | | | Distribution | 01 to 3 | | | | Distribution | 04 to 6 | 2. On Target | 2 | | Distribution | 04 to 6 | 2. On Target | 2 | | Distribution | 04 to 6 | 2. On Target | 2 | | Distribution | 07 to 10 | | | | Distribution | 07 to 10 | | | | Distribution | 07 to 10 | | | | Distribution | 07 to 10 | | |
This is what I want
| Department | Range | Rating | Number | |--------------|----------|--------------|--------| | Admin | 0 (None) | 1. Too Low | 1 | | Admin | 01 to 3 | 1. Too Low | 1 | | Admin | 01 to 3 | 1. Too Low | 1 | | Admin | 01 to 3 | 1. Too Low | 1 | | Admin | 04 to 6 | 2. On Target | 2 | | Admin | 04 to 6 | 2. On Target | 2 | | Admin | 04 to 6 | 2. On Target | 2 | | Admin | 07 to 10 | 3. Too High | 3 | | Admin | 07 to 10 | 3. Too High | 3 | | Admin | 07 to 10 | 3. Too High | 3 | | Admin | 07 to 10 | 3. Too High | 3 | | Distribution | 0 (None) | 1. Too Low | 1 | | Distribution | 01 to 3 | 1. Too Low | 1 | | Distribution | 01 to 3 | 1. Too Low | 1 | | Distribution | 01 to 3 | 1. Too Low | 1 | | Distribution | 04 to 6 | 2. On Target | 2 | | Distribution | 04 to 6 | 2. On Target | 2 | | Distribution | 04 to 6 | 2. On Target | 2 | | Distribution | 07 to 10 | 3. Too High | 3 | | Distribution | 07 to 10 | 3. Too High | 3 | | Distribution | 07 to 10 | 3. Too High | 3 | | Distribution | 07 to 10 | 3. Too High | 3 |
Is there any dynamic way to do this?
You can use pd.concat
with groupby
and utilise a custom function for filling logic:
# convert to numeric
df['Number'] = pd.to_numeric(df['Number'])
# assign values by index
def filler(x):
idx = np.where(x['Number'].notnull())[0]
x.iloc[:idx[0], -2:] = ['1. Too Low', 1]
x.iloc[idx[-1]+1:, -2:] = ['3. Too High', 3]
return x
# concatenate transformed dataframe slices
res = pd.concat(df_slice.pipe(filler) for _, df_slice in df.groupby('Department'))
Result:
print(res)
Department Range Rating Number
0 Admin 0 (None) 1. Too Low 1.0
1 Admin 01 to 3 1. Too Low 1.0
2 Admin 01 to 3 1. Too Low 1.0
3 Admin 01 to 3 1. Too Low 1.0
4 Admin 04 to 6 2. On Target 2.0
5 Admin 04 to 6 2. On Target 2.0
6 Admin 04 to 6 2. On Target 2.0
7 Admin 07 to 10 3. Too High 3.0
8 Admin 07 to 10 3. Too High 3.0
9 Admin 07 to 10 3. Too High 3.0
10 Admin 07 to 10 3. Too High 3.0
11 Distribution 0 (None) 1. Too Low 1.0
12 Distribution 01 to 3 1. Too Low 1.0
13 Distribution 01 to 3 1. Too Low 1.0
14 Distribution 01 to 3 1. Too Low 1.0
15 Distribution 04 to 6 2. On Target 2.0
16 Distribution 04 to 6 2. On Target 2.0
17 Distribution 04 to 6 2. On Target 2.0
18 Distribution 07 to 10 3. Too High 3.0
19 Distribution 07 to 10 3. Too High 3.0
20 Distribution 07 to 10 3. Too High 3.0
21 Distribution 07 to 10 3. Too High 3.0