I need to import the delta-spark package in an Azure Synapse notebook, but I receive the error "ModuleNotFoundError: No module named 'pyspark.errors'".
The package is installed (along with pyspark). Imports from pyspark (such as pyspark.sql.functions) work, but the import of DeltaTable fails on pyspark.errors. The same import works in a different environment (a Databricks notebook) with the same package versions. I tried findspark, but it didn't help (probably because the problem isn't importing pyspark itself, but a submodule under pyspark).
code:
%pip install delta-spark
output (selected):
Requirement already satisfied: delta-spark in /nfs4/pyenv-fee9cba1-d59a-44ab-8b9e-957614b27f23/lib/python3.10/site-packages (3.0.0)
Requirement already satisfied: pyspark<3.6.0,>=3.5.0 in /nfs4/pyenv-fee9cba1-d59a-44ab-8b9e-957614b27f23/lib/python3.10/site-packages (from delta-spark) (3.5.0)
code:
import pyspark.sql.functions as F
from pyspark.sql import Window
from delta.tables import DeltaTable
output (selected):
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
Cell In [9], line 10
8 import pyspark.sql.functions as F
9 from pyspark.sql import Window
---> 10 from delta.tables import DeltaTable
...
File ~/cluster-env/env/lib/python3.10/site-packages/delta/exceptions.py:20
17 from typing import TYPE_CHECKING, Optional
19 from pyspark import SparkContext
---> 20 from pyspark.errors.exceptions import captured
21 from pyspark.errors.exceptions.captured import CapturedException
22 from pyspark.sql.utils import (
23 AnalysisException,
24 IllegalArgumentException,
25 ParseException
26 )
ModuleNotFoundError: No module named 'pyspark.errors'
This is the environment setting of the Spark pool:
SPARK_HOME -> /opt/spark
PYTHONPATH -> /opt/spark/python/lib/pyspark.zip:/opt/spark/python/lib/py4j-0.10.7-src.zip:/opt/spark/python/lib/pyspark.zip:/opt/spark/python/lib/py4j-0.10.7-src.zip
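One way to check which pyspark the interpreter actually resolves is to probe module paths with importlib (a sketch; the helper name is mine). pyspark.errors only exists since PySpark 3.4, so if the cluster's bundled pyspark.zip on PYTHONPATH shadows the pip-installed 3.5.0 wheel, it will show as missing even though pip reports pyspark 3.5.0:

```python
import importlib.util

def module_available(name: str) -> bool:
    """Return True if `name` resolves to an importable module on sys.path."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. pyspark itself) is missing.
        return False

# pyspark.errors was added in PySpark 3.4; an older bundled pyspark that
# shadows the pip-installed wheel will report it as MISSING.
for mod in ("pyspark", "pyspark.sql", "pyspark.errors"):
    print(mod, "->", "found" if module_available(mod) else "MISSING")
```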
from delta.tables import DeltaTable works in a Synapse notebook without installing the delta-spark package.
I got the same error when I ran from delta.tables import DeltaTable after %pip install delta-spark, but before installing it, the import worked fine. Installing delta-spark on top of the delta package that the Synapse runtime already ships is most likely what causes this error.
To use from delta.tables import DeltaTable, don't install delta-spark; the built-in module works directly, without installing any package.
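A minimal sketch of using the built-in module directly (the table path is a hypothetical placeholder; spark is the session that a Synapse notebook provides automatically, and the try/except only guards against running this outside Synapse):

```python
available = False
try:
    # No %pip install needed: the Synapse runtime already ships delta.
    from delta.tables import DeltaTable

    delta_table = DeltaTable.forPath(
        spark,  # the pre-created Spark session in a Synapse notebook
        "abfss://container@account.dfs.core.windows.net/path/to/table",
    )
    delta_table.toDF().show()
    available = True
except (ModuleNotFoundError, NameError):
    # Outside a Synapse Spark session, `delta` and/or `spark` do not exist.
    print("run this inside a Synapse notebook")
```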