I'm trying to assign dataframe columns to the row and column of the .loc
function. I have a DataFrame df
with no set index, with Sites
and Visits
as my column headers
Index | Site | Visit |
---|---|---|
0 | 101 | Visit 1 |
1 | 102 | Visit 1 |
2 | 102 | Visit 2 |
I have another dataframe df2
with Sites
as my dataframe index, and Visits
as my columns with Cost
as my values.
Index | Visit 1 | Visit 2 |
---|---|---|
101 | 50 | 60 |
102 | 100 | 120 |
I'm trying to use .loc
to index the Cost
from df2
and add it as a column to df1
like so
Index | Site | Visit | Cost |
---|---|---|---|
0 | 101 | Visit 1 | 50 |
1 | 102 | Visit 1 | 60 |
2 | 102 | Visit 2 | 120 |
I tried using the following code to provide a row and column value
df['Cost'] = df2.loc[df['Site'],df['Visit']]
But I got the following error:
KeyError: "Passing list-likes to .loc or [] with any missing labels is no longer supported.
The following labels were missing: etc. etc.
Any idea how to use two column values from one DataFrame as the .loc
row and column values?
df2
by stacking 'Visit 1'
and 'Visit 2'
into a single column, with the values in an adjacent column.merge()
import pandas as pd
# create df1
df1 = pd.DataFrame({'Index': ['0', '1', '2'], 'Site': ['101', '102', '102'], 'Visit': ['Visit 1', 'Visit 1', 'Visit 2']})
# drop the Index column
df1.drop(columns=['Index'], inplace=True)
# display(df1)
Site Visit
0 101 Visit 1
1 102 Visit 1
2 102 Visit 2
# create df2
df2 = pd.DataFrame({'Index': ['101', '102'], 'Visit 1': ['50', '100'], 'Visit 2': ['60', '120']})
# display(df2)
Index Visit 1 Visit 2
0 101 50 60
1 102 100 120
# stack Visit 1 and Visit 2 into a single column
df2 = df2.set_index('Index').stack().reset_index()
# rename the columns
df2.columns = ['Site', 'Visit', 'Cost']
# display(df2)
Site Visit Cost
0 101 Visit 1 50
1 101 Visit 2 60
2 102 Visit 1 100
3 102 Visit 2 120
cost = df1.merge(df2, on=['Site', 'Visit'])
# display(cost)
Site Visit Cost
0 101 Visit 1 50
1 102 Visit 1 100
2 102 Visit 2 120