
Databricks Delta Live Tables (DLT) file format (notebooks or .py files?)


I noticed that it is possible to write DLT pipelines in both Databricks notebooks and .py files. Is there a recommended approach?


Solution

  • It really depends on your preferences. If you're writing the code interactively in the Databricks web UI, notebooks can be more convenient: I often write code as individual functions, so I can test them in the same notebook on sample data without running the whole pipeline. If you write code outside of Databricks, plain .py files work as well (although DLT doesn't play well with IDEs because of its heavy use of decorators, but this should be fixed in the future).

    For complex pipelines, I strongly recommend splitting the code into Python packages that can be tested without running DLT. You can check this blog post about DevOps practices for DLT.