python pandas datetime indexing google-finance-api

Adding a time index to a pandas DataFrame from Google Finance


I am using the Google Finance API to pull data into a pandas DataFrame. The index is a plain integer, and I would like to change it to a datetime that includes hours and minutes. Any ideas? Thanks!

import pandas as pd
api_call = 'http://finance.google.com/finance/getprices?q=SPY&i=300&p=1d&f=d,o,h,l,c,' 
df = pd.read_csv(api_call, skiprows=8, header=None)
df.columns = ['Record', 'Open', 'High', 'Low', 'Close']
df['Record'] = df.index

    Record    Open    High     Low   Close
0        0  268.19  268.48  268.18  268.46
1        1  268.14  268.23  267.98  268.19
2        2  268.11  268.19  268.06  268.13
3        3  268.05  268.16  267.96  268.11
4        4  267.93  268.1   267.9   268.06
5        5  267.98  268.01  267.89  267.92
6        6  267.95  267.99  267.86  267.97
7        7  267.88  267.95  267.85  267.94
8        8  267.78  267.9   267.78  267.88
9        9  267.94  267.96  267.68  267.78
10      10  267.91  267.95  267.87  267.94

Solution

  • It doesn't look like pandas can parse this Google API response directly. The raw response from the API looks like this:

    EXCHANGE%3DNYSEARCA
    MARKET_OPEN_MINUTE=570
    MARKET_CLOSE_MINUTE=960
    INTERVAL=300
    COLUMNS=DATE,CLOSE,HIGH,LOW,OPEN
    DATA=
    TIMEZONE_OFFSET=-300
    a1514557800,268.51,268.55,268.48,268.55
    1,268.19,268.48,268.18,268.46
    2,268.14,268.23,267.98,268.19
    3,268.11,268.19,268.06,268.13
    

    The first value in the first data row (the one with the leading a) is a Unix timestamp. Each subsequent value in that column is not a datetime but a counter: row n is n intervals of 300 seconds (the INTERVAL value) after that timestamp, so each row is 300 seconds after the previous one. You need to write something that parses the header information and uses it to create the timestamps.
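
A minimal sketch of that parsing step, assuming the response text has already been downloaded into a string. The function name parse_prices and the RAW sample are made up for illustration, and treating TIMEZONE_OFFSET as minutes from UTC is an assumption (it is consistent with MARKET_OPEN_MINUTE=570, i.e. 09:30). Note that the COLUMNS header gives the order DATE,CLOSE,HIGH,LOW,OPEN, so the sketch takes the column names from the header rather than hard-coding them.

```python
import pandas as pd

# Sample copied from the raw response shown above.
RAW = """EXCHANGE%3DNYSEARCA
MARKET_OPEN_MINUTE=570
MARKET_CLOSE_MINUTE=960
INTERVAL=300
COLUMNS=DATE,CLOSE,HIGH,LOW,OPEN
DATA=
TIMEZONE_OFFSET=-300
a1514557800,268.51,268.55,268.48,268.55
1,268.19,268.48,268.18,268.46
2,268.14,268.23,267.98,268.19
3,268.11,268.19,268.06,268.13
"""


def parse_prices(raw):
    """Hypothetical helper: turn the raw response text into a time-indexed DataFrame."""
    header = {}
    rows = []
    base = None  # Unix timestamp taken from the most recent 'a...' row
    for line in raw.splitlines():
        # The first header line uses a URL-encoded '=' (%3D).
        line = line.strip().replace("%3D", "=")
        if not line:
            continue
        if "=" in line:
            # Header line such as INTERVAL=300
            key, _, value = line.partition("=")
            header[key] = value
            continue
        first, *rest = line.split(",")
        if first.startswith("a"):
            base = int(first[1:])
            offset = 0
        else:
            offset = int(first)
        # Row time = base timestamp + offset * INTERVAL seconds
        seconds = base + offset * int(header["INTERVAL"])
        rows.append([seconds] + [float(v) for v in rest])

    columns = [name.capitalize() for name in header["COLUMNS"].split(",")]
    df = pd.DataFrame(rows, columns=columns)
    # Assumption: TIMEZONE_OFFSET is minutes from UTC, so shift to exchange-local time.
    local = df["Date"] + int(header.get("TIMEZONE_OFFSET", 0)) * 60
    df.index = pd.to_datetime(local.to_numpy(), unit="s")
    df.index.name = "Datetime"
    return df.drop(columns="Date")


df = parse_prices(RAW)
print(df)
```

With a live response, the same function could be fed the downloaded text instead of RAW. The resulting index holds naive local timestamps; if timezone-aware values are needed, it may be preferable to skip the offset shift and localize the UTC index instead.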