I would like to use dbx execute to run a task/job on an Azure Databricks cluster. However, I cannot get it to install my code.
More details on the situation:
Does anyone know how to configure the pip that is used during the dbx execute installation process? It seems to ignore any pip configuration set via init scripts.
I searched through lots of documentation, such as https://docs.databricks.com/libraries/index.html and https://dbx.readthedocs.io/en/latest/reference/deployment/#advanced-package-dependency-management, but with no luck.
Looking at the dbx source, there does not seem to be an option to set any pip.conf :( https://github.com/databrickslabs/dbx/blob/main/dbx/commands/execute.py
I also raised an issue in the dbx GitHub repo: https://github.com/databrickslabs/dbx/issues/669. They pointed me to a link which explains how to do it.
In short: overwrite the global pip.conf at /etc/pip.conf in your init script (init.sh):
#!/bin/bash
# Overwrite the global pip config so every pip call on the cluster picks it up
cat > /etc/pip.conf <<'EOF'
[global]
index-url=https://pypi.org/simple
extra-index-url=https://my.custom.pypi.example.com/simple/
EOF
To make it work with Azure DevOps, I created an Azure DevOps personal access token and adapted the extra-index-url to look like this:
https://<anyname>:<token_with_read_package_permissions>@pkgs.dev.azure.com/<organisation>/<project>/_packaging/<feedname>/pypi/simple/
Replace all values in <...> with your own. <anyname> can be any value, as the token alone is enough for authentication.
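Putting the two pieces together, an init script could build the pip.conf from the individual Azure DevOps values. This is only a rough sketch: the function name render_pip_conf and the variable ADO_PAT are made up for illustration, "build" stands in for <anyname>, and how the token reaches the script (ideally from a secret rather than hard-coded) is left to your setup.

```shell
#!/bin/bash
# Sketch only: render_pip_conf and its arguments are hypothetical names.
# Prints a pip.conf that adds an Azure DevOps feed as an extra index.
render_pip_conf() {
  local token="$1" org="$2" project="$3" feed="$4"
  # "build" plays the role of <anyname>; the token does the authentication.
  cat <<EOF
[global]
index-url=https://pypi.org/simple
extra-index-url=https://build:${token}@pkgs.dev.azure.com/${org}/${project}/_packaging/${feed}/pypi/simple/
EOF
}

# In the init script, the output would be redirected to the global config:
# render_pip_conf "$ADO_PAT" "<organisation>" "<project>" "<feedname>" > /etc/pip.conf
```

Keep in mind that the token ends up in plain text in /etc/pip.conf on the cluster nodes, so a token with read-only package permissions is the safer choice.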