Search code examples
pythonpandasyahoo-financeyfinance

Invalid synthax in df while fetching ESG data from Yahoo Finance


Updated post for csv file structure. csv file has the following structure:

Symbol
AAL
AAON
AAPL
ABCB
Tickers does not have quotation marks.

I have a problem with my code in Python. I am trying to download historical ESG data from Yahoo Finance using yfinance and NASDAQ tickers from .csv file. To download ESG data I am using a code below:

import pandas as pd
import yfinance as yf
import time
from random import randint
import yesg
import requests

# Read in your symbols
nasdaq = pd.read_csv('/path/to/file.csv')

# Endpoint(As far as I am concerned endpoint allows to download historical data from Yahoo)
url = "https://query2.finance.yahoo.com/v1/finance/esgChart"

# List of dataframes
dataframes = []

for symbol in nasdaq["ticker_code"]:
    response = requests.get(url, params={"symbol": symbol})
    if response.ok:
        df = pd.DataFrame(response.json()["esgChart"]["result"][0]["symbolSeries"]
        df["symbol"] = symbol
        dataframes.append(df)

df = pd.concat(dataframes)
df["timestamp"] = pd.to_datetime(df["timestamp"], unit="s")

But there is an invalid synthax error in df["symbol"] = symbol. I can't find out, what could be the reason for the error. By the way, everything is okay with a file path in a row, I just wrote here a sample of a path.


Solution

  • You forgot to close ) to pd.DataFram(...) but note you have to use headers as parameter of requests else your request will be forbidden (HTTP 403).

    url = 'https://query2.finance.yahoo.com/v1/finance/esgChart'
    headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0'}
    
    # List of dataframes
    dfs = {}
    
    for symbol in nasdaq['ticker_code']:
        response = requests.get(url, headers=headers, params={'symbol': symbol})
        data = response.json()
        if response.ok:
            # skip symbol if has no ESG data
            try:
                df = pd.DataFrame(data['esgChart']['result'][0]['symbolSeries'])
                dfs[symbol] = df
            except KeyError:
                pass            
    
    df = pd.concat(dfs, names=['symbol']).reset_index(level='symbol')
    df['timestamp'] = pd.to_datetime(df['timestamp'], unit='s')
    

    Output:

    >>> df
       symbol  timestamp  esgScore  governanceScore  environmentScore  socialScore
    0    AAPL 2014-09-01     61.00            62.00             74.00        45.00
    1    AAPL 2014-10-01     60.00            62.00             74.00        45.00
    2    AAPL 2014-11-01     61.00            62.00             74.00        45.00
    3    AAPL 2014-12-01     61.00            62.00             74.00        45.00
    4    AAPL 2015-01-01     61.00            62.00             74.00        45.00
    ..    ...        ...       ...              ...               ...          ...
    91  GOOGL 2022-04-01       NaN              NaN               NaN          NaN
    92  GOOGL 2022-05-01     24.32            11.54              1.66        11.12
    93  GOOGL 2022-06-01       NaN              NaN               NaN          NaN
    94  GOOGL 2022-07-01       NaN              NaN               NaN          NaN
    95  GOOGL 2022-08-01     24.14            11.39              1.66        11.10
    
    [384 rows x 6 columns]