
pandas DataReader is not working correctly


I am trying to import data from Yahoo Finance, but pandas does not seem to read the start date and end date correctly. It also reports a pandas warning that I don't understand.

This is the code I ran:

import numpy as np
import pandas as pd
from pandas_datareader import data as wb
import matplotlib.pyplot as plt

This is what appears on the screen, although I can still use pandas:

/opt/anaconda3/lib/python3.7/site-packages/pandas_datareader/compat/__init__.py:7: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.
  from pandas.util.testing import assert_frame_equal.

Then I ran this code:

acciones=["PG","BEI.DE"]
datos= pd.DataFrame()
for t in acciones:
    datos[t]=wb.DataReader(t,data_source="yahoo",start=2016-1-1,end=2019-1-1)["Adj Close"]

When I check the output, the dates are shifted back by two years and I don't know why:

datos.tail()

Date        PG             BEI.DE
2016-12-23  76.435783   78.406380
2016-12-27  76.111885   78.726517
2016-12-28  75.635086   78.600410
2016-12-29  75.886978   78.687721
2016-12-30  75.644073   78.192947


datos.head()
Date        PG             BEI.DE
2014-01-02  65.854416   68.331200
2014-01-03  65.780823   68.686317
2014-01-06  65.936180   68.405960
2014-01-07  66.573967   68.592857
2014-01-08  65.609123   68.004128

Solution

  • The message FutureWarning: pandas.util.testing is deprecated is only a warning, so you can still run your code, but it may break in a future pandas release. This issue has been resolved here

    The deprecated import sits in pandas_datareader's own compat module (see the file path in the warning), not in your code. The fix there is to replace the import statement from pandas.util.testing import assert_frame_equal with this one:

    from pandas.testing import assert_frame_equal
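
If you cannot change the installed package, one possible workaround is to silence the FutureWarning around the import with the standard warnings module. The sketch below is only an illustration: legacy_import is a hypothetical stand-in for the deprecated import, not part of pandas or pandas_datareader.

```python
import warnings


def legacy_import():
    # Hypothetical stand-in for the deprecated import inside pandas_datareader.
    warnings.warn("pandas.util.testing is deprecated.", FutureWarning)
    return "loaded"


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")                 # record every warning...
    warnings.simplefilter("ignore", FutureWarning)  # ...except FutureWarning
    result = legacy_import()

# No FutureWarning was recorded, and the call still returned normally.
silenced = len(caught) == 0
print(result, silenced)
```

In real code the same pattern would wrap the line from pandas_datareader import data as wb inside a with warnings.catch_warnings(): block. This hides the message but does not remove the deprecated import.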
    

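    The shifted dates come from how you pass start and end. Written without quotes, 2016-1-1 is not a date but integer subtraction, which evaluates to 2014, and 2019-1-1 evaluates to 2017. That matches the 2014 to 2016 range you got back. Use the datetime library to create your start and end dates so that they are the correct type.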

    import datetime
    import pandas as pd
    import pandas_datareader.data as wb
    
    start_date = datetime.datetime(2016,1,1)
    end_date = datetime.datetime(2019,1,1)
    acciones=["PG","BEI.DE"]
    datos= pd.DataFrame()
    for t in acciones:
        datos[t]=wb.DataReader(t,data_source="yahoo",start=start_date,end=end_date)["Adj Close"]
    

    Output:

    >>> datos.head()
                       PG     BEI.DE
    Date                            
    2016-01-04  68.264992  78.003090
    2016-01-05  68.482758  78.849281
    2016-01-06  67.820770  78.339645
    2016-01-07  67.228455  76.426102
    2016-01-08  66.174454  76.233788
    

    >>> datos.tail()
                       PG     BEI.DE
    Date                            
    2018-12-24  83.908928        NaN
    2018-12-26  86.531075        NaN
    2018-12-27  88.384834  89.214600
    2018-12-28  87.578018  89.805695
    2018-12-31  88.288788        NaN
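
To see why the original call returned data starting in 2014, it may help to evaluate the unquoted expressions on their own. This is a minimal sketch using only the standard library; the variable names are illustrative. That the resulting integers are then treated as years is inferred from the output in the question, not taken from the library's documentation.

```python
import datetime

# Without quotes, Python evaluates these as integer subtraction.
start_as_typed = 2016-1-1
end_as_typed = 2019-1-1
print(start_as_typed, end_as_typed)

# Explicit datetime objects carry the intended dates.
start_date = datetime.datetime(2016, 1, 1)
end_date = datetime.datetime(2019, 1, 1)

# Parsing a quoted string is another way to get the same object.
start_from_text = datetime.datetime.strptime("2016-01-01", "%Y-%m-%d")
print(start_from_text == start_date)
```

Either form removes the ambiguity, because the value passed to DataReader is already a date and not a number.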