I installed pyspark using pip3. Whenever I try to import pyspark in python3, I get an error:
avinash@avinash-HP-ProBook-445-G1:~$ python3
Python 3.7.0 (default, Jun 28 2018, 13:15:42)
[GCC 7.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyspark
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'pyspark'
On the other hand, when I use sudo python3, everything works fine!
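I suspect the two commands may not even resolve to the same interpreter. A generic way to check (the output depends on your PATH and sudo configuration, so this is only a diagnostic, not my actual results):

which python3
sudo which python3
python3 -c "import sys; print(sys.executable)"
sudo python3 -c "import sys; print(sys.executable)"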
A similar thing happens in Jupyter Notebook as well; I have to run sudo jupyter notebook --allow-root to be able to import pyspark. However, other packages such as numpy import fine without sudo, even though they were also installed with pip3.
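pip3 show prints the Location each package was installed to, so comparing numpy with pyspark (assuming pip still has a record of it) should reveal whether they landed in different site-packages directories:

pip3 show numpy
pip3 show pyspark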
Update: I originally installed pyspark using sudo pip3 install pyspark. I tried uninstalling it and then installing it without sudo, i.e. pip3 install pyspark, but that gives an error:
Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/usr/local/lib/python3.6/dist-packages/pyspark-2.4.0.dist-info'
Consider using the --user option or check the permissions.
The strange thing is that there is no file named 'pyspark-2.4.0.dist-info' in the directory /usr/local/lib/python3.6/dist-packages, even though the error mentions it.
I also tried giving full permissions (777) to the above-mentioned directory.
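For completeness, the --user route that the error message suggests would look like the sketch below; it installs into ~/.local instead of /usr/local, so it needs no sudo (whether the import then works still depends on which interpreter python3 resolves to):

pip3 install --user pyspark
python3 -c "import pyspark; print(pyspark.__file__)"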
Based on the error you get, it seems you are using Anaconda on Linux. Without sudo, your python3 resolves to the Anaconda interpreter (its startup banner appears in your traceback), and that interpreter does not see packages that sudo pip3 installed under /usr/local. In such a case you have to install pyspark using the command below:
conda install -c conda-forge pyspark
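Once the install finishes, a quick sanity check (assuming python3 still resolves to the Anaconda interpreter) is:

python3 -c "import pyspark; print(pyspark.__version__)"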