Tags: python, azure-databricks

Databricks: "Numba needs NumPy 1.22" when updating numpy


I'm using Azure Databricks with Python. Among other things, I'm using the umap and numpy modules. When I try import umap.umap_ as umap, I get this error:

ImportError: Numba needs NumPy 1.22 or greater. Got NumPy 1.21.

OK, so just update numpy, right? I ran !{sys.executable} -m pip install numpy --upgrade and the install succeeded. The module explorer now shows the upgraded numpy version:

Module explorer screenshot - numpy

But when I try the umap import again, I get the same error as before, telling me it needs 1.22+. That requirement should now be met, so it looks like the 1.26 version is not the one being loaded. I tried to force the right version with this:

import pkg_resources
pkg_resources.require("numpy==1.26.0")
import numpy as np

Still no luck. That gives:

VersionConflict: (numpy 1.21.5 (/databricks/python3/lib/python3.9/site-packages), Requirement.parse('numpy==1.26.0'))

So how do I do this?


Solution

  • Yes, even after you install numpy==1.26.0 with pip, the notebook still reports numpy==1.21.5.

    There is also an upper bound to respect: numba requires numpy>=1.22,<1.26, so numpy 1.26.0 falls outside the supported range.

    So install numpy==1.24.0 as a cluster library (in the cluster's Libraries configuration) and re-attach the notebook.

    The notebook now picks up the correct numpy version, and import umap succeeds.

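To confirm which numpy the notebook is really using after re-attaching, something like the following sketch may help. It compares the version recorded in the package metadata with the version of the module that is already imported, and checks it against the range quoted above. The helper names (parse_version, in_supported_range, report) are made up for illustration, and the default bounds are simply the ones stated in this answer, so adjust them if your numba release documents a different range.

```python
import importlib.metadata
import sys


def parse_version(text):
    """Turn '1.24.0' into (1, 24, 0), dropping suffixes such as 'rc1'."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


def in_supported_range(version, minimum=(1, 22), maximum=(1, 26)):
    """True if minimum <= major.minor < maximum.

    The default bounds are an assumption taken from the answer above,
    not read from numba itself.
    """
    major_minor = parse_version(version)[:2]
    return minimum <= major_minor < maximum


def report(package="numpy"):
    """Describe the copy of `package` that this interpreter sees."""
    try:
        # Version recorded in the metadata of the first match on sys.path.
        installed = importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        installed = None
    # The module object is only present if the package was already imported.
    module = sys.modules.get(package)
    return {
        "installed": installed,
        "loaded": getattr(module, "__version__", None),
        "location": getattr(module, "__file__", None),
    }


if __name__ == "__main__":
    info = report("numpy")
    print(info)
    if info["installed"] is not None:
        print("within assumed numba range:", in_supported_range(info["installed"]))
```

If "installed" and "loaded" disagree, or "location" still points at /databricks/python3/lib/python3.9/site-packages with the old version, the upgraded package is probably not the one on the interpreter's path, which would match the VersionConflict shown in the question.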