Programmatically set source database in Apache Superset


I am running Apache Superset on AWS ECS so that it can connect directly to our RDS instance. The connection works, but it has to be configured manually.

Is there a way to programmatically configure source databases with Apache Superset?

I have tried setting SQLALCHEMY_DATABASE_URI, but that only configures Superset's own back-end (metadata) database, not the source databases it queries.


Solution

  • Superset has a REST API, so you can create, update, and delete databases via HTTP requests. Authentication is non-trivial, since you first need to log in to get a CSRF token and store the session cookies:

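    import os

    import requests
    from bs4 import BeautifulSoup
    from yarl import URL

    superset_url = URL('https://superset.example.org/')
    # placeholder credentials, read here from hypothetical environment variables
    username = os.environ["SUPERSET_USERNAME"]
    password = os.environ["SUPERSET_PASSWORD"]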
    
    session = requests.Session()
    
    # get CSRF token
    response = session.get(superset_url / "login/")
    soup = BeautifulSoup(response.text, "html.parser")
    csrf_token = soup.find("input", {"id": "csrf_token"})["value"]
    
    # get cookies
    session.post(
        superset_url / "login/",
        data=dict(username=username, password=password, csrf_token=csrf_token),
    )
    
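    # create database
    database = {
        "database_name": "my db",
        "sqlalchemy_uri": "gsheets://",
        # ... (further fields elided)
    }
    response = session.post(
        superset_url / "api/v1/database/",
        json=database,
        headers={"X-CSRFToken": csrf_token},
    )
    response.raise_for_status()  # fail loudly if the database was not created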
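
  • As a possible alternative to scraping the login form, the REST API can also issue a JWT. The following is only a sketch, assuming your Superset version exposes `api/v1/security/login` and `api/v1/security/csrf_token/` and uses the "db" auth provider. The function names `login_with_jwt` and `create_database` are made up for illustration, and `base_url` is assumed to end with a slash. `session` is expected to be a `requests.Session()`.

```python
from urllib.parse import urljoin


def login_with_jwt(session, base_url, username, password):
    """Log in via the REST API and return headers for later calls.

    Assumption: the "db" auth provider is in use; adjust "provider"
    if your deployment authenticates differently (e.g. LDAP).
    """
    response = session.post(
        urljoin(base_url, "api/v1/security/login"),
        json={
            "username": username,
            "password": password,
            "provider": "db",
            "refresh": True,
        },
    )
    response.raise_for_status()
    headers = {"Authorization": f"Bearer {response.json()['access_token']}"}

    # Fetch a CSRF token for write requests, authenticating with the JWT.
    response = session.get(
        urljoin(base_url, "api/v1/security/csrf_token/"), headers=headers
    )
    response.raise_for_status()
    headers["X-CSRFToken"] = response.json()["result"]
    return headers


def create_database(session, base_url, headers, database_name, sqlalchemy_uri):
    """POST a new source database and return the decoded JSON response."""
    response = session.post(
        urljoin(base_url, "api/v1/database/"),
        json={"database_name": database_name, "sqlalchemy_uri": sqlalchemy_uri},
        headers=headers,
    )
    response.raise_for_status()
    return response.json()
```

    Usage would look roughly like `headers = login_with_jwt(requests.Session(), "https://superset.example.org/", username, password)` followed by `create_database(session, "https://superset.example.org/", headers, "my db", "gsheets://")`, reusing the same session for both calls. Keeping the session object as a parameter also makes it easy to run this from a container start-up script or to substitute a stub in tests.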