Search code examples
azurepowerbiazure-databricksdatabricks-sql

interactive cluster vs SQL warehouse to connect to Power BI


I was reading about how to connect Power BI with Databricks and found this article from Microsoft and one of the requirements is:

An Azure Databricks cluster or Databricks SQL warehouse

So, I would like to know which one is the best option in performance or other reason to choose one of them, also I´m not sure why in the article doesn't mention the serverless option to connect.


Solution

  • In general SQL Warehouse should be better for multiple reasons:

    • It's optimized for BI workloads - uses well tested configurations & instance types to achieve best performance, has results cache, etc.
    • It has Photon enabled by default - on interactive clusters you need to enable it separately with increased price
    • Usually it costs less (depends on what SKU is used), see previous item about Photon
    • You can use Serverless SQL Warehouse that starts almost instantly, and you can set very low auto-termination time, so you won't pay when it's not used.

    I can imagine the benefit of the interactive cluster only in few cases:

    • The amount of data is so small so you can process it with very small instance
    • The interactive cluster is already used for development in non-SQL languages, so you can just use it