
How to import a local module into an Azure Databricks notebook?


I'm trying to use a module in a Databricks notebook but I am completely blocked. I'd like to execute the following command, or anything similar that allows me to create instances of MyClass:

from mypackage.mymodule import MyClass

Following Databricks' documentation, I have developed a Python package with a single module locally, as follows:

mypackage
|- __init__.py
|- setup.py
|- mymodule.py
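
For completeness, my setup.py is along these lines (a minimal sketch; judging by the build output below, the distribution name ended up as src):

# setup.py -- minimal sketch; the distribution name "src" is inferred
# from the build artifacts below (src.egg-info, src-0.1-py3-none-any.whl)
from setuptools import setup

setup(
    name="src",
    version="0.1",
    py_modules=["mymodule"],  # the single module in this package
)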

Then I ran python setup.py bdist_wheel, obtaining a .whl file. The directory ends up being:

mypackage
|- build
   |- ... whatever
|- src.egg-info
   |- ... whatever
|- dist
   |- src-0.1-py3-none-any.whl
|- __init__.py
|- setup.py
|- mymodule.py

From here I've uploaded the .whl file into the Workspace, following the instructions. But now I'm not able to import MyClass in any notebook.

I've tried all of the approaches below:

  • uploading the .whl with and without a name.
  • uploading the .whl both installing it into the cluster and not.
  • using import mypackage.
  • using dbutils.library.install('dbfs:/path/to/mypackage.whl/') (which returns True) and then import ... (see the sketch after this list).
  • instead of uploading a .whl, creating the package folder in the same directory as the notebook.
  • uploading to my own folder and to the Shared folder.
  • all combinations of the above, e.g. uploading under a different name and using import differentname.
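
Concretely, the dbutils attempt looked roughly like this (the paths are examples, not my real ones):

# copy the wheel into DBFS (example paths)
dbutils.fs.cp("file:/tmp/src-0.1-py3-none-any.whl",
              "dbfs:/FileStore/wheels/src-0.1-py3-none-any.whl")

# install it into the notebook session; this call returns True
dbutils.library.install("dbfs:/FileStore/wheels/src-0.1-py3-none-any.whl")
dbutils.library.restartPython()

# ...and yet the import still fails
from mypackage.mymodule import MyClass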

This is driving me crazy. It's such a simple task, one I can achieve easily with regular notebooks.


Solution

  • I've solved this by using a Python egg instead of a wheel. Running python setup.py bdist_egg creates an egg, which you can install following the Databricks docs (see the sketch below). I don't know why the wheel doesn't work...
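
For reference, a rough sketch of the working flow (the egg file name is an example; it depends on your package name, version and Python version):

# locally: build the egg instead of the wheel
#   python setup.py bdist_egg

# in a notebook cell: install the egg from DBFS and restart Python
dbutils.library.install("dbfs:/FileStore/eggs/src-0.1-py3.7.egg")
dbutils.library.restartPython()

# the import now resolves
from mypackage.mymodule import MyClass
instance = MyClass()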