database, etl, business-intelligence, apache-superset

Can a Superset dataset use multiple data sources?


I'm evaluating different BI solutions and I have a specific requirement.

Our setup has multiple data sources with the same schema, e.g. Customer1DB, Customer2DB, etc.

Can multiple databases be combined into a single Superset dataset?


Solution

  • Historically, Superset has not supported this. There are a few discussions of this on the project's GitHub; here's one.

    As of 4.0.0, you can enable the ENABLE_SUPERSET_META_DB feature flag to try out cross-database querying, a new feature that is still in testing. It is documented here: https://superset.apache.org/docs/configuration/databases/#querying-across-databases
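
A minimal sketch of turning the flag on, assuming you override settings through the usual superset_config.py file. Per the documentation linked above, the feature also needs a database connection of the "Superset meta database" type (SQLAlchemy URI superset://) to be added afterwards. Check both details against the Superset version you run.

```python
# superset_config.py (sketch; merge with any FEATURE_FLAGS you already define)
FEATURE_FLAGS = {
    # Experimental: enables the Superset meta database used for
    # querying across databases.
    "ENABLE_SUPERSET_META_DB": True,
}
```

Once the meta database is connected, the documentation describes referencing tables by a quoted name that starts with the name of the database they live in.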

    Two other workarounds for combining multiple DBs are:

    • Do the joins in a database suited to this operation, like Trino or Drill, then use this single data source in Superset
    • Someone in the thread linked above says they got this working in Superset by linking the database servers
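
For the case in the question (several databases with an identical schema), the first workaround usually comes down to a UNION ALL across per-customer catalogs in the query engine. This can then be saved as a single dataset in Superset. The helper below is only a sketch: the catalog names (customer1db, customer2db), the schema (public), and the table and columns (orders, id, total) are hypothetical placeholders, and the catalog.schema.table form assumes Trino-style naming.

```python
def union_all_query(catalogs, schema, table, columns):
    """Build one UNION ALL query over identically shaped tables in several catalogs.

    Identifiers are interpolated directly, so only pass trusted, fixed names.
    """
    if not catalogs:
        raise ValueError("at least one catalog is required")
    column_list = ", ".join(columns)
    selects = [
        # Tag each row with its source catalog so charts can filter or group by customer.
        f"SELECT '{catalog}' AS source_db, {column_list} "
        f"FROM {catalog}.{schema}.{table}"
        for catalog in catalogs
    ]
    return "\nUNION ALL\n".join(selects)


# Hypothetical names, for illustration only.
sql = union_all_query(["customer1db", "customer2db"], "public", "orders", ["id", "total"])
print(sql)
```

The added source_db column is a design choice rather than a requirement. Without it, rows from different customers cannot be told apart once they are combined.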

    Superset does support joining tables from the same database: you write the join in SQL Lab and save the result as a new virtual dataset. It can also connect to multiple databases and use them in different charts. It just can't join across them for a single chart.