I'm getting "permission denied" errors when using MLflow 2.1.8 with Databricks runtime 11.3.
It looks like it is trying to write to `/tmp`, which you can't do in Databricks.
I tried setting `MLFLOW_DFS_TMP` in the environment, but it doesn't seem to have any effect.
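For reference, here's roughly how I set it; the DBFS path is just a placeholder for any location you can write to:

```python
import os

# MLFLOW_DFS_TMP is the env var MLflow reads for its DFS temp directory.
# The path below is a placeholder; substitute any writable DBFS location.
os.environ["MLFLOW_DFS_TMP"] = "/dbfs/tmp/mlflow"
```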
It looks like later versions of Databricks like DBR 13 have support for setting the temporary directory, but I'm stuck on DBR 11 for other reasons.
I tracked this down in the MLflow source.
According to the current source as of 2023-11-30, MLflow catches all exceptions when probing for temp directory support on Databricks:
```python
try:
    return _get_dbutils().entry_point.getReplLocalTempDir()
except Exception:
    pass
```
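You can confirm what's happening on a given cluster by calling the method directly. This is just a quick probe I'm sketching, not an official API check:

```python
from mlflow.utils.databricks_utils import _get_dbutils

entry_point = _get_dbutils().entry_point
try:
    # Present on DBR 13, where it returns a writable local temp dir
    print(entry_point.getReplLocalTempDir())
except Exception as e:
    # On DBR 11 the method is missing, so you land here, which is
    # exactly the exception MLflow swallows before falling back to /tmp
    print(f"getReplLocalTempDir unavailable: {e}")
```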
When `getReplLocalTempDir` doesn't exist on `entry_point`, the call throws, the exception is silently swallowed, and MLflow falls back to using `/tmp`.
So I monkey-patched the Databricks entry point from my notebook:
```python
from mlflow.utils.databricks_utils import _get_dbutils

def fake_tmp():
    return '/dbfs/...'  # something writable

_get_dbutils().entry_point.getReplLocalTempDir = fake_tmp
```
This allows `mlflow.pyfunc.log_model()` to run on the old Databricks runtime. The patch isn't needed on DBR 13, but as I said, I'm stuck on DBR 11.
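To show it end to end, here's a minimal sketch with a made-up `EchoModel` (just a placeholder for illustration, not my real model) that exercises the patched path:

```python
import mlflow
import mlflow.pyfunc

class EchoModel(mlflow.pyfunc.PythonModel):
    # Trivial model used only to exercise log_model
    def predict(self, context, model_input):
        return model_input

# With the monkey-patch applied, MLflow's temp files land under the
# writable directory instead of /tmp, so this succeeds on DBR 11.
with mlflow.start_run():
    mlflow.pyfunc.log_model(artifact_path="model", python_model=EchoModel())
```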