Search code examples
pandaspandas-datareader

reading into DataFrame instead of Panel


I'd like to read the quotations of several tickers at the same time. I am using:

import numpy as np
import pandas as pd
import pandas_datareader.data as web
import datetime
import matplotlib.pyplot as plt
%matplotlib inline
start = datetime.datetime(2017, 9, 20)
end = datetime.datetime(2017,9,22)
h = web.DataReader(["EWI", "EWG"], "yahoo", start, end)

... and it seems to work. However, the data are read into a panel data structure. If I print variable "h" I get:

<class 'pandas.core.panel.Panel'>
Dimensions: 6 (items) x 4 (major_axis) x 2 (minor_axis)
Items axis: Adj Close to Volume
Major_axis axis: 2017-09-22 00:00:00 to 2017-09-19 00:00:00
Minor_axis axis: EWG to EWI

I'd like:

  1. to "see" the resulting panel values (I'm relatively new to pandas).
  2. is it possible to flatten the panel into a DataFrame? (IMO it is better documented)
  3. If I read the "Adjusted close" for me it would be more than enough. Perhaps reading into DataFrame directly would be easier?

Thank you


Solution

  • I think you need Panel.to_frame for MultiIndex DataFrame:

    #with random data
    
    df = h.to_frame()
    print (df)
                      Adj Close     Close      High       Low      Open    Volume
    major      minor                                                             
    2013-01-01 EWI     0.471435  0.471435  0.471435  0.471435  0.471435  0.471435
               EWG    -1.190976 -1.190976 -1.190976 -1.190976 -1.190976 -1.190976
    2013-01-02 EWI     1.432707  1.432707  1.432707  1.432707  1.432707  1.432707
               EWG    -0.312652 -0.312652 -0.312652 -0.312652 -0.312652 -0.312652
    2013-01-03 EWI    -0.720589 -0.720589 -0.720589 -0.720589 -0.720589 -0.720589
               EWG     0.887163  0.887163  0.887163  0.887163  0.887163  0.887163
    2013-01-04 EWI     0.859588  0.859588  0.859588  0.859588  0.859588  0.859588
               EWG    -0.636524 -0.636524 -0.636524 -0.636524 -0.636524 -0.636524
    

    And then select column:

    s = df['Adj Close']
    print (s)
    major       minor
    2013-01-01  EWI      0.471435
                EWG     -1.190976
    2013-01-02  EWI      1.432707
                EWG     -0.312652
    2013-01-03  EWI     -0.720589
                EWG      0.887163
    2013-01-04  EWI      0.859588
                EWG     -0.636524
    Name: Adj Close, dtype: float64
    
    df1 = df[['Adj Close']]
    print (df1)
                      Adj Close
    major      minor           
    2013-01-01 EWI     0.471435
               EWG    -1.190976
    2013-01-02 EWI     1.432707
               EWG    -0.312652
    2013-01-03 EWI    -0.720589
               EWG     0.887163
    2013-01-04 EWI     0.859588
               EWG    -0.636524
    

    Notice:

    In future Panel will be deprecated.