python jupyter-notebook jupyter remote-access apache-zeppelin

Interact with Jupyter Notebooks via API


The problem: I want to interact with Jupyter from another application via the Jupyter API. In particular, I want to at least run my notebooks from the app (ideally, I would also edit some paragraphs before running them). I've read the API documentation but haven't found what I need.

I've used Apache Zeppelin for that purpose, which has the same structure (notebooks and paragraphs).

Has anybody used Jupyter for the purpose I've just described?


Solution

  • Setting aside whether the Jupyter API is the best solution to the problem (which the question does not describe clearly), the code below does what you asked for: it executes a Jupyter notebook remotely over HTTP and gets some results back. It is not production ready; it is more an example of how it can be done. I did not test it with cells that generate lots of output, and I think it will need adjustments for that.

    You can also change/edit the code programmatically by altering the code array.

    You will need to change notebook_path, base and headers to match your configuration; see the code for details.
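To keep those three settings in one place, the URLs used in the code can be derived from them. This is only a sketch: the helper name build_endpoints and its return shape are made up for illustration and are not part of any Jupyter library. It assumes token authentication and the /api/kernels, /api/contents and /api/kernels/<id>/channels paths that the code in this answer uses.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def build_endpoints(base, token, notebook_path, kernel_id=None):
    """Hypothetical helper: derive headers and URLs from the three settings."""
    parts = urlsplit(base)
    # The websocket scheme mirrors the HTTP scheme (http -> ws, https -> wss)
    ws_scheme = 'wss' if parts.scheme == 'https' else 'ws'
    root = parts.path.rstrip('/')
    headers = {'Authorization': 'Token ' + token}
    endpoints = {
        'kernels': urlunsplit(
            (parts.scheme, parts.netloc, root + '/api/kernels', '', '')),
        'contents': urlunsplit(
            (parts.scheme, parts.netloc,
             root + '/api/contents/' + quote(notebook_path.lstrip('/')), '', '')),
    }
    if kernel_id is not None:
        # The channels URL only exists once a kernel has been started
        endpoints['channels'] = urlunsplit(
            (ws_scheme, parts.netloc,
             root + '/api/kernels/' + kernel_id + '/channels', '', ''))
    return headers, endpoints


headers, endpoints = build_endpoints(
    'http://localhost:9999', 'my-token', '/Untitled.ipynb', kernel_id='abc123')
```

With this, switching to another server or to https means changing only the base argument.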

    import json
    import requests
    import datetime
    import uuid
    from pprint import pprint
    from websocket import create_connection
    
    # The token is written on stdout when you start the notebook
    notebook_path = '/Untitled.ipynb'
    base = 'http://localhost:9999'
    headers = {'Authorization': 'Token 4a72cb6f71e0f05a6aa931a5e0ec70109099ed0c35f1d840'}
    
    # Start a new kernel through the REST API
    url = base + '/api/kernels'
    response = requests.post(url, headers=headers)
    kernel = json.loads(response.text)
    
    # Load the notebook and get the source of each non-empty code cell
    # (markdown cells are skipped, since they cannot be executed)
    url = base + '/api/contents' + notebook_path
    response = requests.get(url, headers=headers)
    notebook = json.loads(response.text)
    code = [c['source'] for c in notebook['content']['cells']
            if c['cell_type'] == 'code' and len(c['source']) > 0]
    
    # Execution request/reply is done on websockets channels
    # (the websocket URL is derived from base: http -> ws, https -> wss)
    ws = create_connection(base.replace('http', 'ws', 1) + '/api/kernels/' + kernel['id'] + '/channels',
         header=headers)
    
    def send_execute_request(code):
        # Build an execute_request message following the Jupyter messaging protocol
        msg_type = 'execute_request'
        content = { 'code' : code, 'silent':False }
        hdr = { 'msg_id' : uuid.uuid1().hex, 
            'username': 'test', 
            'session': uuid.uuid1().hex, 
            'date': datetime.datetime.now().isoformat(),
            'msg_type': msg_type,
            'version' : '5.0' }
        # A request is not a reply to anything, so its parent_header is empty
        msg = { 'header': hdr, 'parent_header': {},
            'channel': 'shell',
            'metadata': {},
            'content': content }
        return msg
    
    for c in code:
        ws.send(json.dumps(send_execute_request(c)))
    
    # We ignore all the other messages, we just get the code execution output.
    # Caution: a cell that prints nothing never produces a "stream" message, so this
    # loop can pick up a later cell's output or block waiting for more messages
    # (this needs to be improved for production to take into account errors, large cell output, images, etc.)
    for i in range(0, len(code)):
        msg_type = ''
        while msg_type != "stream":
            rsp = json.loads(ws.recv())
            msg_type = rsp["msg_type"]
        print(rsp["content"]["text"])
    
    ws.close()
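As a rough illustration of altering the code array before it is sent, here is a minimal sketch. The helper edit_cells is hypothetical (not part of Jupyter), and it assumes each entry of code is a single string, which is how the contents API returned cell sources in my setup.

```python
def edit_cells(code, replacements):
    """Hypothetical helper: return a new list of cell sources with plain
    text substitutions applied to every cell."""
    edited = []
    for source in code:
        for old, new in replacements.items():
            source = source.replace(old, new)
        edited.append(source)
    return edited


# Stand-in for the list built from the notebook in the code above
code = ["n = 10", "print(sum(range(n)))"]

# Change a parameter before the cells are sent for execution
code = edit_cells(code, {"n = 10": "n = 100"})
```

Call it after the notebook is loaded and before the loop that sends the execute requests. Note that plain string replacement touches every match in every cell, so pick search strings that are specific enough.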
    

    Useful links that this code is based on (recommended reading if you want more information):

    Note that there is also https://jupyter-client.readthedocs.io/en/stable/index.html, but as far as I could tell it does not support HTTP as a transport.

    For reference, this works with notebook-5.7.4; I am not sure about other versions.
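One possible way to make the output handling less fragile (errors, cells that print nothing, several outputs per cell) is to read messages until the kernel reports that it is idle again for a given request, instead of stopping at the first stream message. The sketch below is untested against a live server; collect_output and iter_messages are names made up for this example. It relies on the message types from the Jupyter messaging protocol (stream, execute_result, display_data, error, status) and on each reply carrying the request's msg_id in its parent_header.

```python
import json


def iter_messages(ws):
    """Yield parsed messages from an open websocket connection, forever."""
    while True:
        yield json.loads(ws.recv())


def collect_output(messages, msg_id):
    """Gather the outputs that belong to one execute_request.

    messages: iterable of parsed kernel messages (dicts)
    msg_id:   the msg_id from the header of the request that was sent
    """
    result = {'text': [], 'results': [], 'error': None}
    for msg in messages:
        parent = msg.get('parent_header') or {}
        if parent.get('msg_id') != msg_id:
            continue  # belongs to another request
        msg_type = msg['msg_type']
        content = msg['content']
        if msg_type == 'stream':
            result['text'].append(content['text'])
        elif msg_type in ('execute_result', 'display_data'):
            # 'data' maps MIME types (text/plain, image/png, ...) to payloads
            result['results'].append(content['data'])
        elif msg_type == 'error':
            result['error'] = (content['ename'], content['evalue'])
        elif msg_type == 'status' and content['execution_state'] == 'idle':
            break  # the kernel has finished processing this request
    return result
```

To use it with the code above, keep each request returned by send_execute_request, send it, and then call collect_output(iter_messages(ws), request['header']['msg_id']) before sending the next one.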