Search code examples
pandasdataframepandas-groupbymulti-indexyfinance

Multi-index dataframe split and stack


When I download data from yfinance, I get 8 columns (Open, High, Low, etc...) per ticker. Since I am downloading 15 tickers, I have 120 columns and 1 index column (date). They add up horizontally. See image 1

Instead of having that many columns, in 2 levels, I want just the 8 unique columns. Plus creating one new column that identifies the ticker. See Image 2.

Image 1: Current Form

Image 1 but in raw text:

    Adj Close   ... Volume
DANHOS13.MX FCFE18.MX   FHIPO14.MX  FIBRAHD15.MX    FIBRAMQ12.MX    FIBRAPL14.MX    FIHO12.MX   FINN13.MX   FMTY14.MX   FNOVA17.MX  ... FIBRAPL14.MX    FIHO12.MX   FINN13.MX   FMTY14.MX   FNOVA17.MX  FPLUS16.MX  FSHOP13.MX  FUNO11.MX   FVIA16.MX   TERRA13.MX
Date                                                                                    
2015-01-02  26.065336   NaN 18.526043   NaN 16.337654   18.520781   14.683501   11.301384   9.247743    NaN ... 338697  189552  148064  57  NaN NaN 212451  2649823 NaN 1111343
2015-01-05  24.670488   NaN 18.436762   NaN 15.857328   17.859756   13.795850   11.071105   9.209846    NaN ... 449555  364819  244594  19330   NaN NaN 491587  3317923 NaN 1255128

Image 2: Desired outcome

The code Im applying is:

start = dt.datetime(2015,1,1)
end = dt.datetime.now()

df = yf.download("FUNO11.MX FIBRAMQ12.MX FIHO12.MX DANHOS13.MX FINN13.MX FSHOP13.MX TERRA13.MX FMTY14.MX FIBRAPL14.MX FHIPO14.MX FIBRAHD15.MX FPLUS16.MX FVIA16.MX FNOVA17.MX FCFE18.MX", 
                start = start,
                end = end,
                group_by = 'Ticker',
                actions = True)

Solution

  • I will download the data a little differently:

    import yfinance as yf
    from datetime import datetime as dt
    from dateutil.relativedelta import relativedelta
    
    start = dt(2015,1,1)
    end = dt.now()
    symbols = ["FUNO11.MX", "FIBRAMQ12.MX", "FIHO12.MX", "DANHOS13.MX", "FINN13.MX", "FSHOP13.MX", "TERRA13.MX", "FMTY14.MX",
               "FIBRAPL14.MX", "FHIPO14.MX", "FIBRAHD15.MX", "FPLUS16.MX", "FVIA16.MX", "FNOVA17.MX", "FCFE18.MX"]
    
    data = yf.download(symbols, start=start, end=end, actions=True)
    

    And then Option 1:

    def reshaper(symb, dframe):
        df = dframe.unstack().reset_index()
        df.columns = ['variable','symbol','Date','Value']
        df = df.loc[df.symbol==symb,['Date','variable','Value']].pivot_table(index='Date', columns='variable', values='Value').reset_index()
        df.columns.name = ''
        df['Ticker'] = symb
        return df
    
    
    h = pd.DataFrame()
    
    for s in symbols:
        h = h.append(reshaper(s, data), ignore_index=True)
        
    h
    

    enter image description here

    Option 2: For a one-liner, you could do this:

    data.stack().reset_index().rename(columns={'level_1':'Ticker'})
    

    enter image description here