I have a datafrmae "dfnan" that has NaN values and I need to replace those values with data from a different dataframe "dffill" with specific row insert position requirements by "name" and "month". My data looks like this for dfnan:
index result result result result result month name year
1 4 4 4 4 4 1 Bears 2022
2 20 20 20 20 20 2 Bears 2022
3 8 8 8 8 8 3 Bears 2022
4 5 5 5 5 5 4 Bears 2022
5 3 3 3 3 3 5 Bears 2022
6 19 19 19 19 19 6 Bears 2022
7 nan nan nan nan nan 7 Bears 2022
8 nan nan nan nan nan 8 Bears 2022
9 nan nan nan nan nan 9 Bears 2022
10 nan nan nan nan nan 10 Bears 2022
11 nan nan nan nan nan 11 Bears 2022
12 nan nan nan nan nan 12 Bears 2022
13 5 5 5 5 5 1 Eagles 2022
14 9 9 9 9 9 2 Eagles 2022
15 12 12 12 12 12 3 Eagles 2022
16 21 21 21 21 21 4 Eagles 2022
17 2 2 2 2 2 5 Eagles 2022
18 17 17 17 17 17 6 Eagles 2022
19 nan nan nan nan nan 7 Eagles 2022
20 nan nan nan nan nan 8 Eagles 2022
21 nan nan nan nan nan 9 Eagles 2022
22 nan nan nan nan nan 10 Eagles 2022
23 nan nan nan nan nan 11 Eagles 2022
24 nan nan nan nan nan 12 Eagles 2022
The data i need to fill the NaN values with is here in "dffill":
index month name 1 2 3 4 5
1 7 Bears 10 25 14 4 22
2 8 Bears 5 8 6 24 18
3 9 Bears 18 8 8 14 16
4 10 Bears 19 11 13 8 9
5 11 Bears 16 25 3 9 6
6 12 Bears 17 11 18 3 24
7 7 Eagles 15 24 11 2 25
8 8 Eagles 1 7 18 9 17
9 9 Eagles 11 11 8 18 20
10 10 Eagles 16 20 3 24 2
11 11 Eagles 10 24 6 4 19
12 12 Eagles 8 16 12 19 22
I am sorry but I cannot understand how to insert this row data in the correct position given the requirements of same "name" and "month". Thank you for your help and here is the final result.
index result result result result result month name year
1 4 4 4 4 4 1 Bears 2022
2 20 20 20 20 20 2 Bears 2022
3 8 8 8 8 8 3 Bears 2022
4 5 5 5 5 5 4 Bears 2022
5 3 3 3 3 3 5 Bears 2022
6 19 19 19 19 19 6 Bears 2022
7 10 25 14 4 22 7 Bears 2022
8 5 8 6 24 18 8 Bears 2022
9 18 8 8 14 16 9 Bears 2022
10 19 11 13 8 9 10 Bears 2022
11 16 25 3 9 6 11 Bears 2022
12 17 11 18 3 24 12 Bears 2022
13 5 5 5 5 5 1 Eagles 2022
14 9 9 9 9 9 2 Eagles 2022
15 12 12 12 12 12 3 Eagles 2022
16 21 21 21 21 21 4 Eagles 2022
17 2 2 2 2 2 5 Eagles 2022
18 17 17 17 17 17 6 Eagles 2022
19 15 24 11 2 25 7 Eagles 2022
20 1 7 18 9 17 8 Eagles 2022
21 11 11 8 18 20 9 Eagles 2022
22 16 20 3 24 2 10 Eagles 2022
23 10 24 6 4 19 11 Eagles 2022
24 8 16 12 19 22 12 Eagles 2022
here is one way to do it, which is to make use of df.update assuming its ok to set the index and rename the df2 columns
#set the index on both the DF
df.set_index(['name','month'], inplace=True)
df2.set_index(['name','month'], inplace=True)
#match the columns names b/w df and df2, by taking df columns to rename df2
df2.columns=['result.' + str(int(col)-1) if str(col).isdigit() else col for col in df2.columns]
df2.rename(columns={'result.0':'result'}, inplace=True)
#use update to update the values
df.update(df2, overwrite=False)
#reset the index, in needed
df.reset_index()
name month index result result.1 result.2 result.3 result.4 year
0 Bears 1 1 4.0 4.0 4.0 4.0 4.0 2022
1 Bears 2 2 20.0 20.0 20.0 20.0 20.0 2022
2 Bears 3 3 8.0 8.0 8.0 8.0 8.0 2022
3 Bears 4 4 5.0 5.0 5.0 5.0 5.0 2022
4 Bears 5 5 3.0 3.0 3.0 3.0 3.0 2022
5 Bears 6 6 19.0 19.0 19.0 19.0 19.0 2022
6 Bears 7 7 10.0 25.0 14.0 4.0 22.0 2022
7 Bears 8 8 5.0 8.0 6.0 24.0 18.0 2022
8 Bears 9 9 18.0 8.0 8.0 14.0 16.0 2022
9 Bears 10 10 19.0 11.0 13.0 8.0 9.0 2022
10 Bears 11 11 16.0 25.0 3.0 9.0 6.0 2022
11 Bears 12 12 17.0 11.0 18.0 3.0 24.0 2022
12 Eagles 1 13 5.0 5.0 5.0 5.0 5.0 2022
13 Eagles 2 14 9.0 9.0 9.0 9.0 9.0 2022
14 Eagles 3 15 12.0 12.0 12.0 12.0 12.0 2022
15 Eagles 4 16 21.0 21.0 21.0 21.0 21.0 2022
16 Eagles 5 17 2.0 2.0 2.0 2.0 2.0 2022
17 Eagles 6 18 17.0 17.0 17.0 17.0 17.0 2022
18 Eagles 7 19 15.0 24.0 11.0 2.0 25.0 2022
19 Eagles 8 20 1.0 7.0 18.0 9.0 17.0 2022
20 Eagles 9 21 11.0 11.0 8.0 18.0 20.0 2022
21 Eagles 10 22 16.0 20.0 3.0 24.0 2.0 2022
22 Eagles 11 23 10.0 24.0 6.0 4.0 19.0 2022
23 Eagles 12 24 8.0 16.0 12.0 19.0 22.0 2022