Search code examples
python-3.xpandasdataframequantitative-finance

Pandas: How to take a certain column from a multi-indexed dataframe


I have a Pandas DataFrame of Open, High, Low, Close, Volume for several stocks.

I would like to take only the Close Column for each of the Stock Tickers and create a second separate DataFrame for that - struggling with the Multi-Indexing syntax and understanding; any help would be greatly appreciated! I would like to keep the Data DataFrame untouched for say, CandleStick charts.

import ...
tickers = ['AAPL', 'MSFT', 'INTC', 'AMZN', 'GS', '^GSPC', 'SPY', '^VIX']
data = yf.download(tickers=tickers, start='2010-01-01', end='2020-01-01',
               interval='1d',
               group_by='ticker',
               auto_adjust=True,  # auto adjusts OHLC
               prepost=True,  # download pre/post market hours data
               threads=True,  # use threads for mass downloading?
               proxy=None
               )

Many thanks,

On a separate note, as you can see in the Excel output, the date index contains the timestamp "00:00:00"- anyway to remove that within the DataFrame and/or for Excel output? - no need to spend too much time worrying about it, just a thought.

Excel Representation of first 15 rows and some of the stocks


Solution

  • Use the advanced xs method to select from the deeper levels of a MultiIndex.

    data.xs('Close', level=1, axis=1)
    #                    AMZN       ^VIX         SPY        ^GSPC        MSFT          GS       INTC        AAPL
    # Date                                                                                                      
    # 2010-01-04   133.899994  20.040001   92.246048  1132.989990   24.294369  149.746597  15.251445   26.538483
    # 2010-01-05   134.690002  19.350000   92.490204  1136.520020   24.302216  152.394012  15.244140   26.584366
    # 2010-01-06   132.250000  19.160000   92.555328  1137.140015   24.153070  150.767426  15.193007   26.161509
    # 2010-01-07   130.000000  19.059999   92.946060  1141.689941   23.901886  153.717728  15.046927   26.113146
    # 2010-01-08   133.520004  18.129999   93.255348  1144.979980   24.066734  150.810715  15.214921   26.286753
    # ...                 ...        ...         ...          ...         ...         ...        ...         ...
    # 2019-12-24  1789.209961  12.670000  319.352142  3223.379883  156.951309  228.512817  59.118862  283.596924
    # 2019-12-26  1868.770020  12.650000  321.052124  3239.909912  158.237793  229.804916  59.526852  289.223602
    # 2019-12-27  1869.800049  13.430000  320.972565  3240.020020  158.527008  229.258255  59.785580  289.113831
    # 2019-12-30  1846.890015  14.820000  319.202972  3221.290039  157.160736  228.403488  59.327831  290.829773
    # 2019-12-31  1847.839966  13.780000  319.978424  3230.780029  157.270432  228.532684  59.556702  292.954712