azure-data-factory databricks azure-databricks azure-git-deployment databricks-repos

Running a Databricks notebook connected to Git via ADF, independent of the Git username


At our company we orchestrate Databricks notebooks by connecting them (they belong to a Git repository) to ADF pipelines, a setup we arrived at through experimentation. However, there is an issue.

As you can see in the screenshots attached to this question, the path to the notebook depends on the employee's username, which is not a stable solution for production.

What are the possible solutions?

  • Update: The main issue is keeping the employee username out of production to avoid future failures, whether in the notebook path in ADF or in a secondary storage location that is read by a Lookup activity but still sits on the production side.

Path selection in ADF: [screenshots]


Solution

  • If you want to avoid having the username in the path, you can create a folder inside Repos and do the checkout there. The full instructions:

    • In Repos, at the top level, click the drop-down arrow next to the "Repos" header, select "Create", then select "Folder". Give it a name, such as "Staging".

    • Create a repository inside that folder

    Click the drop-down arrow next to the "Staging" folder, click "Create", and select "Repo".

    After that, you can navigate to that repository in the ADF UI. The notebook path then starts with /Repos/Staging/ rather than with a username.

    It's also recommended to set permissions on the folder so that only specific people can update the projects inside it.
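The same folder and checkout can probably be scripted rather than clicked through. The following is a minimal sketch, assuming the Databricks REST endpoints `/api/2.0/workspace/mkdirs` and `/api/2.0/repos` are available in your workspace and that you authenticate with a personal access token. The folder name "Staging" comes from the steps above; the repository name, Git URL, notebook location, and the helper names `plan_repo_checkout`, `notebook_path`, and `run_plan` are hypothetical and only illustrate the idea.

```python
import json
import urllib.request


def plan_repo_checkout(folder, repo_name, git_url, provider):
    """Build the REST calls for a shared (non-user) Repos checkout.

    Returns a list of (method, endpoint, payload) tuples; nothing is sent here.
    """
    folder_path = f"/Repos/{folder}"
    repo_path = f"{folder_path}/{repo_name}"
    return [
        # Step 1: create the top-level folder, e.g. /Repos/Staging
        ("POST", "/api/2.0/workspace/mkdirs", {"path": folder_path}),
        # Step 2: check the Git repository out inside that folder
        ("POST", "/api/2.0/repos",
         {"url": git_url, "provider": provider, "path": repo_path}),
    ]


def notebook_path(folder, repo_name, relative_notebook):
    """Return the username-free notebook path to enter in the ADF activity."""
    return f"/Repos/{folder}/{repo_name}/{relative_notebook.strip('/')}"


def run_plan(host, token, plan):
    """Send the planned calls to the workspace (assumes a valid token)."""
    for method, endpoint, payload in plan:
        request = urllib.request.Request(
            host.rstrip("/") + endpoint,
            data=json.dumps(payload).encode("utf-8"),
            headers={
                "Authorization": f"Bearer {token}",
                "Content-Type": "application/json",
            },
            method=method,
        )
        with urllib.request.urlopen(request) as response:
            response.read()


if __name__ == "__main__":
    # Hypothetical values; replace them with your own repository details.
    plan = plan_repo_checkout(
        "Staging", "my-project",
        "https://example.com/org/my-project.git", "gitHub",
    )
    for step in plan:
        print(step)
    print(notebook_path("Staging", "my-project", "notebooks/etl"))
```

Calling `run_plan("https://<your-workspace-url>", "<token>", plan)` would send the two requests. The path printed by `notebook_path` is what the ADF notebook activity would reference, so no employee username appears in the pipeline definition.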