Tags: python, pandas, ubuntu-14.04, google-compute-engine, yahoo-finance

Trouble with HTTP requests from Google Compute Engine


I'm trying to set up a Google Compute Engine server to pull options data from Yahoo! Finance using pandas. When I make the request from my Mac at home, it fails only late at night, when Yahoo! is resetting its servers. From the Compute Engine server, however, the request always fails for some of the stocks I'm interested in, although it typically works for options on larger companies such as 'aapl' or 'ge'. Run at the same time from my computer at home, the same requests succeed for both small and large companies.

The requests typically take a few seconds, at most about 15. Is there a way to get more detailed logs of what happens when I make these requests from the Google servers? My only guesses are a permissions problem with these specific HTTP requests or a configured timeout that is interfering. As far as I can tell, though, the general timeout for that kind of request should be 75 seconds, and the requests are not taking anywhere near that long.

Here's a sample of what I see from the Python shell:

>>> from pandas.io.data import Options
>>> spwr = Options('spwr', 'yahoo')
>>> data = spwr.get_all_data()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/mnt/disk1/venv/optbot/local/lib/python2.7/site-packages/pandas/io/data.py", line 1090, in get_all_data
    return self._get_data_in_date_range(dates=expiry_dates, call=call, put=put)
  File "/mnt/disk1/venv/optbot/local/lib/python2.7/site-packages/pandas/io/data.py", line 1104, in _get_data_in_date_range
    frame = self._get_option_data(expiry=expiry_date, name=name)
  File "/mnt/disk1/venv/optbot/local/lib/python2.7/site-packages/pandas/io/data.py", line 723, in _get_option_data
    frames = self._get_option_frames_from_yahoo(expiry)
  File "/mnt/disk1/venv/optbot/local/lib/python2.7/site-packages/pandas/io/data.py", line 655, in _get_option_frames_from_yahoo
    option_frames = self._option_frames_from_url(url)
  File "/mnt/disk1/venv/optbot/local/lib/python2.7/site-packages/pandas/io/data.py", line 692, in _option_frames_from_url
    raise RemoteDataError('Received no data from Yahoo at url: %s' % url)
pandas.io.data.RemoteDataError: Received no data from Yahoo at url: http://finance.yahoo.com/q/op?s=SPWR&date=1430438400
>>> aapl = Options('aapl', 'yahoo')
>>> data = aapl.get_all_data()
>>>

I have never managed to get the options data for 'spwr', but it usually works for larger companies.

Any ideas on how I might fix this, or how to get to logs that would tell me more about what's happening?
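
One possible way to see more than the pandas error reports is to fetch the failing URL directly and log the status code, content type, and body size, then compare the output from the two machines. The following is only a sketch under some assumptions: it is written for Python 3 (on Python 2.7 the equivalent calls live in `urllib2`), the names `describe`, `probe`, and `yahoo_probe` are made up for illustration, and the 75-second timeout is simply the figure mentioned above.

```python
import logging
import urllib.error
import urllib.request

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("yahoo_probe")  # hypothetical logger name


def describe(status, headers, body):
    """Summarize a response so runs from two machines can be compared."""
    return {
        "status": status,
        "content_type": headers.get("Content-Type", ""),
        "bytes": len(body),
    }


def probe(url, timeout=75):
    """Fetch url once and log what came back, including HTTP error responses."""
    try:
        resp = urllib.request.urlopen(url, timeout=timeout)
        summary = describe(resp.getcode(), resp.headers, resp.read())
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx or 5xx).
        summary = describe(err.code, err.headers, err.read())
    except urllib.error.URLError as err:
        # No HTTP response at all (DNS failure, refused connection, ...).
        log.error("request failed before any response: %s", err.reason)
        return None
    log.info("%s -> %s", url, summary)
    return summary
```

Calling `probe(...)` with the URL from the traceback on both the home machine and the Compute Engine server would show whether the two receive different responses. If the summaries match, the difference is more likely on the parsing side than in the network.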


Solution

  • This is caused by an issue in Pandas 0.15.2. When I reverted to Pandas 0.15.1, it started working again. The issue has been filed with the pandas project; check there to see whether it has been resolved in later releases.
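
A minimal sketch of how one might check whether an environment is running the release named above. The helper names `parse_version` and `is_affected` are hypothetical, and the only version information used is what this answer states (0.15.2 fails, 0.15.1 works).

```python
AFFECTED = (0, 15, 2)  # the release this answer reports as failing


def parse_version(version):
    """Turn '0.15.2' into (0, 15, 2), ignoring non-numeric suffixes."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def is_affected(version):
    """True only for the exact release reported as failing."""
    return parse_version(version) == AFFECTED


if __name__ == "__main__":
    try:
        import pandas
    except ImportError:
        print("pandas is not installed")
    else:
        if is_affected(pandas.__version__):
            print("pandas %s is the release reported to fail" % pandas.__version__)
        else:
            print("pandas %s is not the release reported to fail" % pandas.__version__)
```

If the check reports the affected release, pinning the earlier one inside the virtualenv with `pip install pandas==0.15.1` matches the revert described here.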