gitlab databricks azure-databricks gitlab-ci-runner

Databricks and GitLab integration/automation


I would like to set up CI/CD in GitLab. The pipeline should update all Databricks notebooks in Repos with the most recent Git code (some developers do not use the Databricks UI, but an IDE such as VS Code).

I was able to find an Azure DevOps integration, which I tried to adapt:

stages:
  - update_dbx_notebooks

update_dbx_notebooks:
  stage: update_dbx_notebooks
  before_script:
    # Install dependencies
    - python -m pip install --upgrade databricks-cli
  script:
    - echo "Checking out the $CI_COMMIT_BRANCH branch"
    - databricks repos update --path "/Repos/databricksUser/SL_dataprovider_staging" --branch "$CI_COMMIT_BRANCH"

I have generated a token, so I am able to pull and commit from Databricks notebooks against GitLab without problems. But I think the GitLab runner must also authenticate against Databricks, right? Does it make sense to create a VM with a GitLab runner on Azure?

Does anyone have experience with GitLab/GitHub integration?


Solution

    • Create your gitlab-runner on a Linux machine and give the gitlab-runner user sudo privileges (or only the specific ones your pipeline needs).
    • Start the runner and register it with your project.
    • Add a profile named Runner to the ~/.databrickscfg config file in the gitlab-runner user's home directory; see the Databricks CLI docs (or you can use the default profile after configuring it manually).
    • Of course, you also have to authorize your GitLab repository in Databricks.
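
The profile step above can be sketched as a small shell helper. This is only a sketch, assuming the legacy databricks-cli config format (an INI section with `host` and `token` keys); `write_runner_profile` is a hypothetical helper name, and the host and token values are placeholders you would supply yourself.

```shell
# Sketch: append a "Runner" profile to a databricks-cli config file.
# write_runner_profile is a hypothetical helper, not part of databricks-cli.
write_runner_profile() {
  cfg_file="$1"   # e.g. "$HOME/.databrickscfg" of the gitlab-runner user
  host="$2"       # workspace URL (placeholder, supply your own)
  token="$3"      # personal access token (placeholder, supply your own)

  # The section name must match the value passed to --profile in the pipeline.
  cat >> "$cfg_file" <<EOF
[Runner]
host = $host
token = $token
EOF

  # The file contains a secret, so restrict it to the owning user.
  chmod 600 "$cfg_file"
}
```

Run as the gitlab-runner user, a call might look like `write_runner_profile "$HOME/.databrickscfg" "$DATABRICKS_HOST" "$DATABRICKS_TOKEN"`, where both variables are assumed to be set beforehand.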

    Add the pipeline to your .gitlab-ci.yml:

    stages:
      - update_dbx_notebooks
    
    update_dbx_notebooks:
      stage: update_dbx_notebooks
      before_script:
        - echo "Installation of databricks-cli via $USER"
        - sudo python3 -m pip install databricks-cli
      script:
        - echo "Test ls dbx space"
        - databricks workspace ls /Users/username --profile Runner
        - echo "Checking out the $CI_COMMIT_BRANCH  branch"
        - databricks repos update --path "/Repos/username/staging" --branch "$CI_COMMIT_BRANCH" --profile Runner
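
If the Runner profile is missing, the CLI calls in the job will fail with an authentication error. A possible way to fail earlier with a clearer message is a small check in `before_script`. This is a sketch only: `has_profile` is a hypothetical helper, and it assumes the profile is stored as an INI section header such as `[Runner]` on its own line.

```shell
# Sketch: check that a named profile exists in a databricks-cli config file.
# has_profile is a hypothetical helper, not part of databricks-cli.
has_profile() {
  cfg_file="$1"   # path to the config file, e.g. "$HOME/.databrickscfg"
  profile="$2"    # profile name, e.g. Runner

  # Succeeds only if the file exists and contains the exact line "[profile]".
  [ -f "$cfg_file" ] && grep -qxF "[$profile]" "$cfg_file"
}
```

In the job it could be used as `has_profile "$HOME/.databrickscfg" Runner || { echo "Runner profile not found" >&2; exit 1; }` before the first `databricks` command.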

    Voilà.