S&P 500 List python script crashes


I have been following a YouTube tutorial on Python for finance, and since Yahoo has closed off access to its financial data, I have run into a few lingering problems.

I run this code

import bs4 as bs
import datetime as dt 
import os
import pandas as pd
import pandas_datareader.data as web
import pickle
import requests
from pandas_datareader import data as pdr
import fix_yahoo_finance as yf


def save_sp500_tickers():
    resp = requests.get('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
    soup = bs.BeautifulSoup(resp.text, "lxml")
    table = soup.find('table', {'class':'wikitable sortable'})
    tickers = []
    for row in table.findAll('tr')[1:]:
        ticker = row.findAll('td')[0].text
        tickers.append(ticker)

    with open("sp500tickers.pickle", "wb") as f:
        pickle.dump(ticker, f)

        print(tickers)

    return tickers

# save_sp500_tickers()

def get_data_from_yahoo(reload_sp500=False):

    if reload_sp500:
        tickers = save_sp500_tickers()
    else:
        with open("sp500tickers.pickle", "rb") as f:
            tickers = pickle.load(f)

    if not os.path.exists('stock_dfs'):
        os.makedirs('stock_dfs')

    start = dt.datetime(2000, 1, 1)
    end = dt.datetime(2017, 8, 24)

    for ticker in tickers:
        if not os.path.exists('stock_dfs/{}.csv'.format(ticker)):
            data = pdr.get_data_yahoo(ticker, start, end)
            df.to_csv('stock_dfs/{}.csv'.format(ticker))
        else:
            print('Already have {}'.format(ticker))

get_data_from_yahoo()

and it crashes with several errors, not just one. The first one says that I should override pandas_datareader:

DeprecationWarning: 
Auto-overriding of pandas_datareader's get_data_yahoo() is deprecated and will be removed in future versions.
Use pdr_override() to explicitly override it.
DeprecationWarning)  

How do I override it? I really don't know how. I am new to Python, so sorry for being a noob.

Then we have this:

    get_data_from_yahoo()
  File "C:\Users\Mehdi\Desktop\Python finance\SP500_List.py", line 36, in get_data_from_yahoo
    tickers = pickle.load(f)

I really don't understand why this happens, because I have checked my code against the YouTuber's and they match. Some pointers would be appreciated.

Finally, I have this error:

  EOFError: Ran out of input

I don't know what this one means either.

On top of that, I installed the 'fix_yahoo_finance' package and tried it with the new code as well, and it still does not work.

Any help is appreciated. Thanks :)

Full error list:

C:\Users\Mehdi\AppData\Local\Programs\Python\Python36-32\lib\site-packages\fix_yahoo_finance\__init__.py:43: DeprecationWarning: 
    Auto-overriding of pandas_datareader's get_data_yahoo() is deprecated and will be removed in future versions.
    Use pdr_override() to explicitly override it.
  DeprecationWarning)
Traceback (most recent call last):
  File "C:\Users\Mehdi\Desktop\Python finance\SP500_List.py", line 51, in <module>
    get_data_from_yahoo()
  File "C:\Users\Mehdi\Desktop\Python finance\SP500_List.py", line 36, in get_data_from_yahoo
    tickers = pickle.load(f)
EOFError: Ran out of input
[Finished in 3.1s with exit code 1]
[shell_cmd: python -u "C:\Users\Mehdi\Desktop\Python finance\SP500_List.py"]
[dir: C:\Users\Mehdi\Desktop\Python finance]

Solution

  • You have two mistakes in your code:

    1. In line 22 in function save_sp500_tickers(), instead of this:

      with open("sp500tickers.pickle", "wb") as f:
          pickle.dump(ticker, f)
      

      it should be:

      with open("sp500tickers.pickle", "wb") as f:
          pickle.dump(tickers, f)
      

      So it is tickers (the whole list) instead of ticker (the loop variable, which holds only the last symbol once the loop has finished).

    2. In line 47 in function get_data_from_yahoo(), instead of this:

      if not os.path.exists('stock_dfs/{}.csv'.format(ticker)):
          data = pdr.get_data_yahoo(ticker, start, end)
          df.to_csv('stock_dfs/{}.csv'.format(ticker))
      

      It should be:

      if not os.path.exists('stock_dfs/{}.csv'.format(ticker)):
          data = pdr.get_data_yahoo(ticker, start, end)
          data.to_csv('stock_dfs/{}.csv'.format(ticker))
      

      You need to use data instead of df. The video used df = web.DataReader(ticker, 'yahoo', start, end), which you changed to data = pdr.get_data_yahoo(ticker, start, end), but you forgot to change df.to_csv('stock_dfs/{}.csv'.format(ticker)) to data.to_csv('stock_dfs/{}.csv'.format(ticker)).
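
A note on the EOFError: as far as I can tell, the first mistake does not produce it by itself, since dumping a single string still writes a valid pickle. pickle.load raises "EOFError: Ran out of input" when the file it reads is empty. That could happen if sp500tickers.pickle was created by an earlier run that failed before anything was written, but this is an assumption about your setup, not something the traceback proves. The sketch below uses made-up stand-in tickers and a temporary directory (both are illustrative, not from your script) to show what each version of the dump leaves in the file and what loading an empty file does.

```python
import os
import pickle
import tempfile

# Stand-in values for illustration; the real list comes from the Wikipedia table.
tickers = ["MMM", "ABT", "ZTS"]
ticker = tickers[-1]  # after the for loop, `ticker` holds only the last symbol

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "sp500tickers.pickle")

# Buggy version: pickles a single string instead of the list.
with open(path, "wb") as f:
    pickle.dump(ticker, f)
with open(path, "rb") as f:
    loaded_buggy = pickle.load(f)  # a str; looping over it yields single characters

# Fixed version: pickles the whole list.
with open(path, "wb") as f:
    pickle.dump(tickers, f)
with open(path, "rb") as f:
    loaded_fixed = pickle.load(f)

# An empty pickle file makes pickle.load raise EOFError.
empty_path = os.path.join(workdir, "empty.pickle")
open(empty_path, "wb").close()
eof_raised = False
try:
    with open(empty_path, "rb") as f:
        pickle.load(f)
except EOFError:
    eof_raised = True

print(type(loaded_buggy).__name__, loaded_buggy)
print(type(loaded_fixed).__name__, loaded_fixed)
print("EOFError on empty file:", eof_raised)
```

If the file on disk turns out to be empty or to hold only one ticker, deleting it and calling get_data_from_yahoo(reload_sp500=True) once should regenerate it with the corrected dump.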