Quandl data, API call


Recently I have been reading a stock price database on Quandl, using API calls to extract the data, but I am really confused by the example I have.

import requests

api_url = 'https://www.quandl.com/api/v1/datasets/WIKI/%s.json' % stock
session = requests.Session()
session.mount('http://', requests.adapters.HTTPAdapter(max_retries=3))
raw_data = session.get(api_url)

Can anyone explain that to me?

1) For api_url, if I open that URL as a web page, it says 404 Not Found. So if I want to use another database, how do I prepare this api_url? What does '% stock' mean?

2) Here requests appears to be used to extract the data. What is the format of raw_data? How do I know the column names? How do I extract the columns?


Solution

  • To expand on my comment above:

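    1. % stock is a string formatting operation: it replaces %s in the preceding string with the value referenced by stock. Pasting the URL with the literal %s still in it asks for a dataset that does not exist, which would explain the 404. Further details are in the Python documentation on printf-style string formatting.
    2. raw_data references a Response object (part of the requests module; details are in the requests documentation).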
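
As a small illustration of the formatting step, the sketch below builds the URL for a couple of ticker symbols without making any network request. The helper name build_url, the constant URL_TEMPLATE, and the second ticker are illustrative assumptions, not part of the original example.

```python
# Hypothetical helper, for illustration only: builds the dataset URL
# for a given ticker symbol using %-style string formatting.
URL_TEMPLATE = 'https://www.quandl.com/api/v1/datasets/WIKI/%s.json'


def build_url(stock):
    # '%s' in the template is replaced by the value of `stock`
    return URL_TEMPLATE % stock


for ticker in ['AAPL', 'MSFT']:
    print(build_url(ticker))
```

To query a different database, the same idea would apply to whatever part of the path changes: put a %s placeholder there and supply the value after the % operator.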

    To expand on your code:

    import requests

    # Set the stock we are interested in; AAPL is Apple's ticker symbol
    stock = 'AAPL'
    # Your code
    api_url = 'https://www.quandl.com/api/v1/datasets/WIKI/%s.json' % stock
    session = requests.Session()
    # The URL is https, so the retry adapter is mounted on 'https://'
    # (an adapter mounted on 'http://' is never used for this URL)
    session.mount('https://', requests.adapters.HTTPAdapter(max_retries=3))
    raw_data = session.get(api_url)

    # Check that the request succeeded (200 - OK) before using the content;
    # raise_for_status() raises requests.HTTPError on a 4xx/5xx response.
    raw_data.raise_for_status()

    # requests.Response has a json() method that parses the JSON body into a Python dict
    aapl_stock = raw_data.json()
    # We can then look at the keys to see what we have access to
    aapl_stock.keys()
    # column_names seems to describe the fields of each data point
    aapl_stock['column_names']
    # A big list of data, let's just look at the first ten points...
    aapl_stock['data'][0:10]
    

    Edit to answer question in comment

    So aapl_stock['column_names'] shows Date and Open as the first and second values respectively. This means they correspond to positions 0 and 1 in each element (row) of aapl_stock['data'].

    Slicing the outer list only selects rows, so aapl_stock['data'][0:10][0] is the whole first row rather than ten dates. To get a column, index into each row instead: [row[0] for row in aapl_stock['data'][0:10]] gives the date for the first ten items, and [row[1] for row in aapl_stock['data'][0:78]] gives the open value for the first 78 items.

    To get a list covering every row in the dataset, where each element is a list with the values for Date and Open, you could add something like aapl_date_open = [row[0:2] for row in aapl_stock['data']].
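
The difference between slicing rows and picking out a column can be checked on a small stand-in for the parsed response. The values below are made up for illustration and only mimic the shape of aapl_stock (a 'column_names' list plus a 'data' list of rows); they are not real Quandl data.

```python
# Made-up sample with the same shape as the parsed JSON (not real prices)
sample = {
    'column_names': ['Date', 'Open', 'High'],
    'data': [
        ['2015-01-05', 108.29, 108.65],
        ['2015-01-02', 111.39, 111.44],
        ['2014-12-31', 112.82, 113.13],
    ],
}

# Look up column positions by name instead of hard-coding 0 and 1
date_idx = sample['column_names'].index('Date')
open_idx = sample['column_names'].index('Open')

# Slicing and then indexing the outer list yields one whole row
first_row = sample['data'][0:2][0]
# Indexing inside a list comprehension yields one column
dates = [row[date_idx] for row in sample['data']]
# Pair up two columns for every row
date_open = [[row[date_idx], row[open_idx]] for row in sample['data']]

print(first_row)
print(dates)
print(date_open)
```

Looking up positions with column_names.index(...) is one way to avoid relying on the column order staying the same.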

    If you are new to Python, I seriously recommend reading up on list slice notation; the official Python tutorial has a quick introduction.