git, databricks, databricks-workflows

Can you add Databricks jobs to a Git repo?


I'm trying to add Databricks jobs to a Git repo. I can see that a job can run notebooks from a Git repo, but I don't know whether the job itself can be added to a Git repo.


Solution

  • It's still not possible to save a job definition into Git "natively", but it can be done in several ways:

    • Use the Databricks Terraform provider's databricks_job resource (doc). Its biggest advantage is that it handles dependencies on other resources, such as existing clusters, DLT pipelines, etc., but it requires familiarity with Terraform. You can also export existing jobs together with their dependencies using the Terraform Exporter (doc), with the -match option to select specific jobs.
    • Use Databricks Asset Bundles. This is a relatively new feature of the new Databricks CLI that lets you describe a job and its related resources in a YAML file and then deploy them to one or more workspaces. See this product tour & DAIS 2023 presentation for more details.
    • Export the Databricks job definition as JSON from the UI, and then use that JSON definition with the Databricks CLI or REST API. This method is the most complex when you need to deploy a job with dependencies on other resources, so it's not recommended unless you know what you're doing.
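
To illustrate the last option, here is a minimal Python sketch of the REST API route, using only the standard library. It assumes the Jobs API 2.1 endpoints `jobs/get` and `jobs/create` and a personal access token sent as a bearer token. The helper names (`job_definition_json`, `build_request`, `export_job`, `create_job_from_file`) are made up for this example and are not part of any Databricks tooling. Treat it as a starting point rather than a complete deployment tool.

```python
import json
import urllib.request


def job_definition_json(job_response: dict) -> str:
    """Return the 'settings' part of a jobs/get response as stable JSON text.

    Fields outside 'settings' (job_id, created_time, ...) belong to one
    workspace, so they are left out of the file that goes into Git.
    """
    settings = job_response["settings"]
    # sort_keys keeps diffs small when the file is re-exported later.
    return json.dumps(settings, indent=2, sort_keys=True) + "\n"


def build_request(host: str, token: str, endpoint: str, payload=None):
    """Build (but do not send) a request to /api/2.1/jobs/<endpoint>."""
    url = f"{host.rstrip('/')}/api/2.1/jobs/{endpoint}"
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    method = "POST" if data is not None else "GET"
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def call_api(host: str, token: str, endpoint: str, payload=None) -> dict:
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(build_request(host, token, endpoint, payload)) as resp:
        return json.load(resp)


def export_job(host: str, token: str, job_id: int, path: str) -> None:
    """Fetch one job and write its settings to a file you can commit."""
    job = call_api(host, token, f"get?job_id={job_id}")
    with open(path, "w", encoding="utf-8") as f:
        f.write(job_definition_json(job))


def create_job_from_file(host: str, token: str, path: str) -> dict:
    """Create a new job in a workspace from a committed settings file."""
    with open(path, encoding="utf-8") as f:
        settings = json.load(f)
    return call_api(host, token, "create", settings)
```

You would call `export_job(host, token, job_id, "jobs/my_job.json")` with your own workspace URL, token, and job ID, commit the resulting file, and later pass it to `create_job_from_file` for the target workspace. Note that the exported settings can still contain workspace-specific references (for example, IDs of existing clusters), which is exactly the dependency problem mentioned above, and which this sketch does not try to solve.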