I am using the Google Finance API to pull data into a pandas DataFrame. The index is a plain row number, and I would like to change it to a date that includes hours and minutes. Any ideas? Thanks!
import pandas as pd
api_call = 'http://finance.google.com/finance/getprices?q=SPY&i=300&p=1d&f=d,o,h,l,c,'
df = pd.read_csv(api_call, skiprows=8, header=None)
df.columns = ['Record', 'Open', 'High', 'Low', 'Close']
df['Record'] = df.index
Record Open High Low Close
0 0 268.19 268.48 268.18 268.46
1 1 268.14 268.23 267.98 268.19
2 2 268.11 268.19 268.06 268.13
3 3 268.05 268.16 267.96 268.11
4 4 267.93 268.1 267.9 268.06
5 5 267.98 268.01 267.89 267.92
6 6 267.95 267.99 267.86 267.97
7 7 267.88 267.95 267.85 267.94
8 8 267.78 267.9 267.78 267.88
9 9 267.94 267.96 267.68 267.78
10 10 267.91 267.95 267.87 267.94
It doesn't look like pandas supports reading the Google API response directly. If you look at the raw response from the API, it looks like this:
EXCHANGE%3DNYSEARCA
MARKET_OPEN_MINUTE=570
MARKET_CLOSE_MINUTE=960
INTERVAL=300
COLUMNS=DATE,CLOSE,HIGH,LOW,OPEN
DATA=
TIMEZONE_OFFSET=-300
a1514557800,268.51,268.55,268.48,268.55
1,268.19,268.48,268.18,268.46
2,268.14,268.23,267.98,268.19
3,268.11,268.19,268.06,268.13
That first datetime value (with the leading a) is the Unix timestamp. Each subsequent "datetime" is really an offset: the row holds the data for the next 300 seconds (the INTERVAL value) after the previous row. You need to write something that parses the header information and uses it to create the timestamps.
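Here is a minimal sketch of that parsing step, run against the sample response pasted above rather than the live endpoint. The helper name parse_prices is made up for illustration. It rests on two assumptions that the sample does not confirm on its own: that each plain number is a count of INTERVAL-sized steps from the most recent a-prefixed timestamp, and that TIMEZONE_OFFSET is given in minutes.

```python
from urllib.parse import unquote

import pandas as pd

# Sample response copied from this answer; the live endpoint may return more rows.
RAW = """EXCHANGE%3DNYSEARCA
MARKET_OPEN_MINUTE=570
MARKET_CLOSE_MINUTE=960
INTERVAL=300
COLUMNS=DATE,CLOSE,HIGH,LOW,OPEN
DATA=
TIMEZONE_OFFSET=-300
a1514557800,268.51,268.55,268.48,268.55
1,268.19,268.48,268.18,268.46
2,268.14,268.23,267.98,268.19
3,268.11,268.19,268.06,268.13
"""


def parse_prices(raw):
    """Hypothetical helper: build a DataFrame indexed by timestamp from the raw text."""
    header = {}
    stamps = []   # (base_timestamp, offset) pairs, one per data row
    values = []   # numeric fields of each data row
    base = None
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        first, _, rest = line.partition(",")
        if first.startswith("a") and first[1:].isdigit():
            # Absolute row: a Unix timestamp that resets the base.
            base, offset = int(first[1:]), 0
        elif first.isdigit() and base is not None:
            # Relative row: assumed to count intervals from the base.
            offset = int(first)
        else:
            # Anything else is treated as a KEY=VALUE header line.
            key, _, value = unquote(line).partition("=")
            header[key] = value
            continue
        stamps.append((base, offset))
        values.append([float(v) for v in rest.split(",")])

    interval = int(header["INTERVAL"])                    # seconds per row
    tz_minutes = int(header.get("TIMEZONE_OFFSET", 0))    # assumed to be minutes
    seconds = [b + o * interval for b, o in stamps]
    # Shift from UTC to the exchange's local wall-clock time (naive timestamps).
    index = pd.to_datetime(seconds, unit="s") + pd.Timedelta(minutes=tz_minutes)
    columns = header["COLUMNS"].split(",")
    df = pd.DataFrame(values, columns=columns[1:], index=index)
    df.index.name = columns[0]
    return df


df = parse_prices(RAW)
print(df)
```

To use it on live data you would fetch the response text yourself (for example with requests or urllib) and pass it in place of RAW. Note that the COLUMNS header lists the fields as CLOSE, HIGH, LOW, OPEN, which is a different order from the names assigned in the question, so the sketch takes the column names from the header instead of hard-coding them.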