I am trying to access a model file that I previously copied over via the CLI, using the following code in a notebook at https://community.cloud.databricks.com/:
import joblib

with open("/dbfs/cat_encoder.joblib", "rb") as f:
    lb_category = joblib.load(f)
For this I get
FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/cat_encoder.joblib'
As I said, I had copied the file using the CLI by running:
dbfs cp cat_encoder.joblib dbfs:/cat_encoder.joblib
Then doing
databricks fs ls "dbfs:/"
I see the file which I had copied.
But if I were to do this in my notebook:
import os

os.chdir('/dbfs')
print(os.listdir('.'))
I see an empty directory instead of the folders and files I see when using the UI or the CLI.
If I write a file into this seemingly empty directory from the notebook, that does work, and I then see exactly one file in the directory: the one I just wrote. The problem is that I want to read what I had already put there beforehand.
It looks as if the local file API cannot see what the proverbial other hand is doing with the datasets and models I have loaded via the CLI or the UI. So why can I not see these files? Does it have something to do with credentials, and if so, how do I resolve that? Or is it likely something else entirely, like mounting? I am doing an introductory trial and some basic exercises on my own to learn Databricks, so I am not too familiar with the underlying concepts.
This is a behavior change in Databricks Runtime 7.x on the Community Edition (and only there): files on dbfs:/ aren't available anymore via the local /dbfs/... path. If you want to access that DBFS file locally, you can use dbutils.fs.cp('dbfs:/file', 'file:/local-path') (or %fs cp ...) to copy the file from DBFS to the local file system, where you can work with it.
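For example, a minimal sketch of that workaround in a notebook cell, using the file name from the question (the /tmp destination path is just an assumption, and dbutils is only defined inside a Databricks notebook):

import joblib

# Copy the file from DBFS to the driver's local file system
# (destination path /tmp/cat_encoder.joblib is an arbitrary choice)
dbutils.fs.cp("dbfs:/cat_encoder.joblib", "file:/tmp/cat_encoder.joblib")

# The local Python file API can now see the copy
with open("/tmp/cat_encoder.joblib", "rb") as f:
    lb_category = joblib.load(f)

dbutils.fs (and the %fs magic) talk to DBFS directly rather than going through the local /dbfs path, so they are not affected by this restriction.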