
How to import module python from a parent folder (Databricks Jobs - Python Script)?


Project contains these folders:

project
├── config
|   └── utils.py
└── src
    └── module01
        └── file01.py

In file01.py, the import is: from config.utils import *

When file01.py is run as a task in a Databricks Job, the following error occurs:

ImportError: attempted relative import with no known parent package

Note: This error does not occur when I run the file in the Workspace; it only occurs with Jobs in Workflows.

I tried to run a Databricks job with a Python script that imports a module from a parent folder. I expected the import to run successfully.


Solution

  • This is how I was able to solve this issue for a repo that I am not ready to productionize as a module that can be properly installed according to Databricks' best practices (i.e. a pip-installable module, a Python wheel file, etc.):

    In the script being run add the following to the top of the file:

    import sys
    sys.path.append('/Workspace/Repos/{repo_account}/{name_of_repo}')
    

    Replace repo_account with the user the repo is stored under, and name_of_repo with the name of the project's top-level directory.
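
    If you would rather not hardcode the repo path, a variation is to derive the project root from the running file's own location. This is a hedged sketch, not from the original answer: it assumes the layout shown in the question, where file01.py sits at project/src/module01/file01.py, so the project root is two parents above the file.

    ```python
    import sys
    from pathlib import Path

    # Assumed layout (from the question):
    #   project/src/module01/file01.py
    # parents[0] = module01, parents[1] = src, parents[2] = project root
    repo_root = Path(__file__).resolve().parents[2]

    # Put the project root on the module search path before importing,
    # so that "config" is importable as a top-level package.
    if str(repo_root) not in sys.path:
        sys.path.append(str(repo_root))

    # After this, the absolute import should resolve:
    # from config.utils import *
    ```

    The advantage over an absolute /Workspace/Repos/... string is that the same script works regardless of which user's folder the repo is checked out under.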

    What helped me in debugging this was importing sys and printing sys.path to see what the environment used by the cluster looks like.
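
    That debugging step can be as simple as the following snippet; running it once interactively in the Workspace and once as a Jobs task, then comparing the output, shows which search-path entries the job run is missing:

    ```python
    import sys

    # Print each entry on the interpreter's module search path.
    # In a Jobs run, the repo root is typically absent, which is why
    # "from config.utils import *" fails there but not in the Workspace.
    for entry in sys.path:
        print(entry)
    ```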