I have 2 dataframes and I want to fill the na in df1 below, with the data that's in df2 but were the team column matches either the h or v column in df2:
So in other words fillna for df1['temp', 'wspd', 'cond'] with those vales in df2 where df1['team'] == df2['h'] or df2['v']
df1:
team day temp ou wspd cond
CIN SUN nan 42.0 nan nan
KC MN nan 43.5 nan nan
IND SUN nan 42.0 nan nan
CIN SUN nan 42.0 nan nan
SEA SUN nan 45.5 nan nan
PIT SN nan 45.0 nan nan
MIA THU nan 37.5 nan nan
DEN MN nan 43.5 nan nan
NYJ SUN nan 46.5 nan nan
CAR SUN nan 44.5 nan nan
LAC SUN nan 48.0 nan nan
MIN SUN nan 45.0 nan nan
NO SUN nan 48.5 nan nan
NYJ SUN nan 46.5 nan nan
CLE SUN nan 37.5 nan nan
OAK SUN nan 45.5 nan nan
CAR SUN nan 44.5 nan nan
OAK SUN nan 45.5 nan nan
... ... ... ... ... ...
df2:
v h temp humd wspd cond
MIA BAL 60.0 41.0 12.0 Mostly Sunny
MIN CLE 47.0 58.0 14.0 Cloudy
CHI NO 70.0 0.0 0.0 Dome
ATL NYJ 66.0 83.0 17.0 Rain
IND CIN 44.0 58.0 12.0 Cloudy
SF PHI 65.0 81.0 11.0 Rain
OAK BUF 49.0 65.0 9.0 Cloudy
LAC NE 64.0 75.0 13.0 Showers
CAR TB 67.0 59.0 17.0 Partly Cloudy
HOU SEA 58.0 73.0 4.0 Cloudy
DAL WAS 58.0 66.0 6.0 Rain
PIT DET 70.0 0.0 0.0 Dome
DEN KC 55.0 40.0 12.0 Partly Cloudy
I would approach this by first reshaping df2
so that columns v
and h
are in one column, and then setting that column as the index:
df2 = df2.melt(value_vars=['v','h'], \
id_vars=['temp','humd','wspd','cond'], value_name='team').set_index('team')
# temp humd wspd cond variable
# team
# MIA 60.0 41.0 12.0 Mostly Sunny v
# MIN 47.0 58.0 14.0 Cloudy v
# CHI 70.0 0.0 0.0 Dome v
# ATL 66.0 83.0 17.0 Rain v
# IND 44.0 58.0 12.0 Cloudy v
From there, you can set team
as the index in df
and use df.update
:
df1.set_index('team', inplace=True)
df1.update(df2[['temp', 'wspd', 'cond']])
# day temp ou wspd cond
# team
# CIN SUN 44.0 42.0 12.0 Cloudy
# KC MN 55.0 43.5 12.0 Partly Cloudy
# IND SUN 44.0 42.0 12.0 Cloudy
# CIN SUN 44.0 42.0 12.0 Cloudy
# SEA SUN 58.0 45.5 4.0 Cloudy