I have this code:
import yfinance as yF
import datetime
import pandas as pd
df = pd.DataFrame({'ID':['1', '2'], 'Ticker': ['AIN', 'TILE'], 'Company':['Albany International', 'Interface']})
def get_returns(tic,com):
df_Stock = yF.download(tickers = tic, period = "max", interval = "1mo", prepost = False, repair = False)
df_Stock[com + ' % Growth'] = df_Stock['High']-df_Stock['Open']
df_Stock[com + ' % Growth'] = (df_Stock[com + ' % Growth'] * 100 )/df_Stock['Open']
return df_Stock[com + ' % Growth']
get_returns('AIN','AIN')
Everything work as expected until this point.
df1 = df.apply(lambda x: get_returns(x.Ticker, x.Company), axis=1)
Here I am trying to use this function
get_returns()
with Pandas apply and lambda on the dataframe df defined above. The desired output is another dataframe df1 which has 3 columns namely
If there are more rows in dataframe df, each company will be a new column in df1 and its monthly return would be the row value for the given time period under the Date column.
This is the output I got when running your code for the following data frame
ID | Ticker | Company | |
---|---|---|---|
0 | 1 | AIN | Albany International |
1 | 2 | TILE | Interface |
Output
Date | 1985-01-01 | 1985-02-01 | 1985-03-01 | 1985-04-01 | 1985-05-01 | 1985-06-01 | 1985-07-01 | 1985-08-01 | 1985-09-01 | 1985-10-01 | ... | 2023-01-01 | 2023-02-01 | 2023-03-01 | 2023-04-01 | 2023-05-01 | 2023-06-01 | 2023-07-01 | 2023-08-01 | 2023-09-01 | 2023-10-01 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 13.081953 | 3.767988 | 1.609221 | 3.527832 | 3.188085 | 11.953453 | 5.196635 | 0.891469 | 1.352506 | 0.290596 |
1 | inf | inf | inf | inf | inf | inf | inf | inf | inf | inf | ... | 15.577892 | 8.091470 | 5.604387 | 1.231532 | 2.809710 | 27.745666 | 12.328766 | 8.771934 | 3.567984 | 1.530608 |
I believe that the code that you're running creates an entry for each index in the original data frame. So row 0 corresponds to 'AIN' and row 1 corresponds to 'TILE'.
To get it into the desired format you need to transpose df1 and rename the columns based on df.
df1 = df1.T
df1.rename(columns=df['Company'] + ' % Returns', inplace=True)
df1.rename(columns={v: f"Company {k+1} - {v}" for k, v in enumerate(df1.columns)}, inplace=True)
Output:
Date | Company 1 - Albany International % Returns | Company 2 - Interface % Returns |
---|---|---|
1985-01-01 | NaN | inf |
1985-02-01 | NaN | inf |
1985-03-01 | NaN | inf |
1985-04-01 | NaN | inf |
1985-05-01 | NaN | inf |
... | ... | ... |
2023-06-01 | 11.953453 | 27.745666 |
2023-07-01 | 5.196635 | 12.328766 |
2023-08-01 | 0.891469 | 8.771934 |
2023-09-01 | 1.352506 | 3.567984 |
2023-10-01 | 0.290596 | 1.530608 |