python pandas google-finance

What column should I assign to parse_dates when working with Google Finance?


I wrote some code to plot a graph from Google Finance data, but I got this error:

ValueError: Missing column provided to 'parse_dates': 'Date'

This was my code:

import pandas
from bokeh.plotting import figure, output_file, show

data = pandas.read_csv('http://www.google.com/finance/historical?q=NASDAQ:AAPL&startdate=Jan+01%2C+2000&output=csv',parse_dates = ["Date"])

f = figure(height = 800,width = 600,x_axis_type = 'datetime')

f.line(data["Date"],data["Close"],color = "orange",alpha = 0.5)
output_file("History.html")
show(f)

What column should I assign to make my code work?


Solution

  • Your URL 'http://www.google.com/finance/historical?q=NASDAQ:AAPL&startdate=Jan+01%2C+2000&output=csv' does not return a proper CSV file, so pd.read_csv() cannot parse it as one. As a result, there is no column named Date.

    If you parse the same page with pd.read_html() instead, it returns

    [                  (USD)  \
    0               Revenue   
    1            Net income   
    2           Diluted EPS   
    3     Net profit margin   
    4      Operating income   
    5    Net change in cash   
    6  Cash and equivalents   
    7       Cost of revenue   
    
      Dec 2021infoFiscal Q1 2022 ended 12/25/21. Reported on 1/27/22.  \
    0                                            123.94B                
    1                                             34.63B                
    2                                               2.10                
    3                                             27.94%                
    4                                             41.49B                
    5                                              2.70B                
    6                                             37.12B                
    7                                             69.70B                
    
      Year/year change  
    0           11.22%  
    1           20.43%  
    2           25.00%  
    3            8.29%  
    4           23.72%  
    5          230.48%  
    6            3.08%  
    7            3.86%  ]
    

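    This is a table of quarterly financials, not the historical prices you want, and it has no Date column either. Other posts use pandas_datareader to fetch historical price data.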
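One way to diagnose this kind of error is to look at what the server actually sent back before passing parse_dates. The following is only a minimal sketch using the standard library: the helper names csv_header and looks_like_html and the two sample payloads are made up for illustration, and the HTML check is a rough heuristic, not a robust detector.

```python
import csv
import io


def csv_header(text):
    """Return the column names from the first row of CSV text."""
    reader = csv.reader(io.StringIO(text))
    # An empty payload has no header row at all.
    return next(reader, [])


def looks_like_html(text):
    """Heuristic: HTML pages usually start with a doctype or an <html> tag."""
    head = text.lstrip().lower()
    return head.startswith("<!doctype") or head.startswith("<html")


# Hypothetical payloads standing in for what a server might send back.
good = "Date,Open,Close\n2000-01-03,3.75,4.00\n"
bad = "<!DOCTYPE html><html><body>Finance</body></html>"

# A real CSV exposes a Date column; an HTML page does not.
print(looks_like_html(good), "Date" in csv_header(good))
print(looks_like_html(bad), "Date" in csv_header(bad))
```

If the header does not contain Date, parse_dates = ["Date"] will fail no matter which column name you try, so the fix has to be a different data source rather than a different column.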