Tags: azure, azure-devops, databricks, azure-databricks, databricks-cli

How to export files generated in Azure Databricks to Azure DevOps after a job terminates?


We are using Azure DevOps to submit a training job to Databricks. The training job uses a notebook to train a machine learning model, and we submit the job from ADO with the Databricks CLI.

In the notebook, one of the steps creates a .pkl file. We want to download this file to the build agent and publish it as an artifact in Azure DevOps. How do we do this?


Solution

  • It really depends on how that file is stored:

    1. If it is saved on DBFS, you can run databricks fs cp 'dbfs:/....' local-path on the build agent.
    2. If the file is stored on the driver's local file system, first copy it to DBFS (for example, with dbutils.fs.cp) and then use the previous item; see the first sketch after this list.
    3. If the model is tracked by MLflow, you can either explicitly export the model to DBFS via the MLflow API (or the REST API), or push it to Azure DevOps directly if you have the correct credentials, or use a tool such as mlflow-export-import to export models/experiments/runs to local disk. See the second sketch after this list.
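
For item 2, here is a minimal notebook sketch. The paths and the model object are placeholders standing in for whatever your training step actually produces:

```python
# Inside the Databricks notebook. The paths and the model object are
# placeholders; substitute whatever your training step actually produces.
import pickle

model = {"weights": [0.1, 0.2, 0.3]}   # stand-in for your trained estimator

local_path = "/tmp/model.pkl"                    # driver's local disk
dbfs_path = "dbfs:/FileStore/models/model.pkl"   # DBFS target

with open(local_path, "wb") as f:
    pickle.dump(model, f)

# dbutils is available as a builtin in Databricks notebooks; the "file:"
# scheme addresses the driver's local filesystem.
dbutils.fs.cp("file:" + local_path, dbfs_path)
```

On the build agent, databricks fs cp dbfs:/FileStore/models/model.pkl ./model.pkl then downloads the file, and a PublishBuildArtifacts task can publish it.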
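
For item 3, a minimal sketch of the MLflow route, assuming the run ID is known and the model was logged under the artifact path "model" (both are assumptions about your setup):

```python
# Assumes the training run is tracked in MLflow and its run ID is known.
# "model" is the artifact path used when the model was logged, e.g. via
# mlflow.sklearn.log_model(model, "model"); adjust if yours differs.
from mlflow.tracking import MlflowClient

run_id = "<your-run-id>"   # placeholder: take it from your training run

client = MlflowClient()
# Download the run's model artifacts to a DBFS-backed path (/dbfs is the
# FUSE mount of DBFS on the driver), where the CLI can reach them.
client.download_artifacts(run_id, "model", "/dbfs/FileStore/models")
```

From there the build agent can pull the directory with databricks fs cp --recursive and publish it the same way as above.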