jupyter-notebook amazon-sagemaker amazon-sagemaker-studio

SageMaker Jupyter Notebook Cannot Access Local Files


I am running notebooks in SageMaker Studio.

When I create a notebook and run it from within Studio, it executes from a directory that corresponds to what I see in the left sidebar:

    import os
    print("getcwd", os.getcwd())

    getcwd /root/test

However, when I schedule the same notebook using the UI, the job executes from /opt/ml/input/data/sagemaker_headless_execution.

That directory contains the notebook I am running, but nothing else.

On my terminal, I can navigate to /home/sagemaker-user/mydirectory, but when I do this in the notebook, /home is empty.

My notebook needs access to certain files stored in the local directory. How do I mount or attach them?

I could read and write everything through boto or SQLAlchemy, but then what is the point of SageMaker having a file system? It also means that a flow which works when the notebook is run from within the UI or locally breaks when it is run on a schedule, which seems wrong.


Solution

  • Notebook jobs run as training jobs in the backend, and the Studio file system is not mounted to the training job. Any additional files your notebook needs (other than the notebook itself) therefore have to live in S3, or another location the headless job can reach, so that the job can fetch them at run time.
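
One way to apply this is to have the notebook pull its inputs from S3 in its first cell, so the same code path works both interactively and on a schedule. The snippet below is only a minimal sketch: the helper name `fetch_inputs`, the bucket `my-bucket`, the prefix `mydirectory/` and the local folder `inputs` are all hypothetical, and it assumes you have already uploaded the files to S3 and that the job's execution role is allowed to list and read them. The S3 client is passed in as an argument; in the notebook that would be `boto3.client("s3")`.

```python
import os


def fetch_inputs(s3_client, bucket, prefix, dest_dir):
    """Download every object under s3://bucket/prefix into dest_dir.

    Hypothetical helper, not part of SageMaker or boto3.
    Returns the list of local paths that were written.
    """
    downloaded = []
    # list_objects_v2 returns at most 1000 keys per call, so use the paginator.
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):
                # Skip zero-byte "folder" placeholder objects.
                continue
            # Path of the object relative to the prefix; fall back to the
            # file name when the prefix is the full key of a single object.
            rel = key[len(prefix):].lstrip("/") or os.path.basename(key)
            local_path = os.path.join(dest_dir, rel)
            os.makedirs(os.path.dirname(local_path) or ".", exist_ok=True)
            s3_client.download_file(bucket, key, local_path)
            downloaded.append(local_path)
    return downloaded


# Example call for the first cell of the notebook (names are assumptions):
#   import boto3
#   fetch_inputs(boto3.client("s3"), "my-bucket", "mydirectory/", "inputs")
```

After that cell has run, the rest of the notebook can open files under `inputs/` with relative paths, regardless of whether the working directory is /root/test or /opt/ml/input/data/sagemaker_headless_execution.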