I have two dataframes that I am trying to join in some way to produce a third dataframe.
I am doing this in python.
df1:
Date | A | B | C |
---|---|---|---|
27/04/2023 00:00 | 55.6637 | 0.32194 | 0.145006 |
28/04/2023 00:00 | 57.1432 | 0.327949 | 0.14712 |
29/04/2023 00:00 | 57.3842 | 0.322167 | 0.146119 |
30/04/2023 00:00 | 57.1259 | 0.320318 | 0.145881 |
df2:
Date | change | Ticker |
---|---|---|
27/04/2023 00:00 | 0.1 | A |
28/04/2023 00:00 | -0.1 | A |
29/04/2023 00:00 | 0.2 | A |
30/04/2023 00:00 | -0.2 | A |
27/04/2023 00:00 | 0.3 | B |
28/04/2023 00:00 | -0.3 | B |
29/04/2023 00:00 | 0.4 | B |
30/04/2023 00:00 | -0.4 | B |
27/04/2023 00:00 | 0.5 | C |
28/04/2023 00:00 | -0.5 | C |
29/04/2023 00:00 | 0.6 | C |
30/04/2023 00:00 | -0.6 | C |
I want to join the two dataframes so that the value in df1 are joined to df2 on Date and where the ticker in df2 is equal to the column name in df1.
The result should look like this:
Date | change | Ticker | Px |
---|---|---|---|
27/04/2023 00:00 | 0.1 | A | 55.6637 |
28/04/2023 00:00 | -0.1 | A | 57.1432 |
29/04/2023 00:00 | 0.2 | A | 57.3842 |
30/04/2023 00:00 | -0.2 | A | 57.1259 |
27/04/2023 00:00 | 0.3 | B | 0.32194 |
28/04/2023 00:00 | -0.3 | B | 0.327949 |
29/04/2023 00:00 | 0.4 | B | 0.322167 |
30/04/2023 00:00 | -0.4 | B | 0.320318 |
27/04/2023 00:00 | 0.5 | C | 0.145006 |
28/04/2023 00:00 | -0.5 | C | 0.14712 |
29/04/2023 00:00 | 0.6 | C | 0.146119 |
30/04/2023 00:00 | -0.6 | C | 0.145881 |
I have tried using .iloc[]
but I couldnt get it to work
Create DataFrames:
import pandas as pd
dates = ['27/04/2023 00:00', '28/04/2023 00:00']
df1 = pd.DataFrame({'Date': dates,'A': [55.6, 57.2],'B': [0.31, 0.32]})
df2 = pd.DataFrame({'Date': dates*2, 'change': list('zxcv'),'Ticker': list('AABB')})
Actual soulton:
df3 = pd.melt(df1, id_vars=['Date'], value_vars=df2.Ticker.unique(), var_name='Ticker', value_name='Px')
df_out = pd.merge(df2, df3, on=['Date','Ticker'], how='inner')
In the df3
function melt unpivot df1
from wide to long format.
df_out
merged df1
and df3
together on columns Date
and Ticker
.
df1:
Date | A | B |
---|---|---|
27/04/2023 00:00 | 55.6 | 0.31 |
28/04/2023 00:00 | 57.2 | 0.32 |
df2:
Date | change | Ticker |
---|---|---|
27/04/2023 00:00 | z | A |
28/04/2023 00:00 | x | A |
27/04/2023 00:00 | c | B |
28/04/2023 00:00 | v | B |
df3:
Date | Ticker | Px |
---|---|---|
27/04/2023 00:00 | A | 55.6 |
28/04/2023 00:00 | A | 57.2 |
27/04/2023 00:00 | B | 0.31 |
28/04/2023 00:00 | B | 0.32 |
df_out
Date | change | Ticker | Px |
---|---|---|---|
27/04/2023 00:00 | z | A | 55.6 |
28/04/2023 00:00 | x | A | 57.2 |
27/04/2023 00:00 | c | B | 0.31 |
28/04/2023 00:00 | v | B | 0.32 |