I am very new to Python package development. I developed a package and published it on TestPyPI. I can install this package through pip
with no errors. However, Python gives me a ModuleNotFoundError when I try to import it, and I have no idea why. Can someone help me?
First, I install the package with:
pip install -i https://test.pypi.org/simple/ spark-map==0.2.76
Then, I open a new terminal, start the Python interpreter, and try to import this package, but Python gives me a ModuleNotFoundError:
>>> import spark_map
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'spark_map'
When I cd to the root folder of the package, open the Python interpreter, and run import spark_map, it works fine with no errors.
My first guess was that pip did not install the package successfully. However, I checked this: I got no error messages when I installed the package, and when I run pip list after the pip install command, I see spark_map in the list of installed packages.
> pip list
... many packages
spark-map 0.2.76
... more packages
My second guess was that the folder where spark_map was installed could be outside Python's module search path. I checked this as well: pip is installing the package in a folder called Python310\lib\site-packages, and this folder is included in the sys.path variable:
>>> import sys
>>> for path in sys.path:
...     print(path)
C:\Users\pedro\AppData\Local\Programs\Python\Python310\python310.zip
C:\Users\pedro\AppData\Local\Programs\Python\Python310\DLLs
C:\Users\pedro\AppData\Local\Programs\Python\Python310\lib
C:\Users\pedro\AppData\Local\Programs\Python\Python310
C:\Users\pedro\AppData\Local\Programs\Python\Python310\lib\site-packages
C:\Users\pedro\AppData\Local\Programs\Python\Python310\lib\site-packages\win32
C:\Users\pedro\AppData\Local\Programs\Python\Python310\lib\site-packages\win32\lib
C:\Users\pedro\AppData\Local\Programs\Python\Python310\lib\site-packages\Pythonwin
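A further check that fits here, offered only as a sketch and not as something I ran at the time: pip list only proves that the package metadata was installed, not that any importable code came with it. The standard library's importlib.metadata can list the files pip recorded for a distribution. The helper name installed_python_files is hypothetical.

```python
from importlib import metadata


def installed_python_files(dist_name):
    """Return the .py files pip recorded for an installed distribution.

    Returns None when the distribution is not installed in this environment.
    """
    try:
        files = metadata.files(dist_name)
    except metadata.PackageNotFoundError:
        return None
    # metadata.files() can itself return None when the RECORD file is missing,
    # so fall back to an empty list in that case.
    return [str(f) for f in (files or []) if str(f).endswith(".py")]


# Example call (the distribution name is the one shown by pip list):
# print(installed_python_files("spark-map"))
```

If this returns an empty list for a package that pip list reports as installed, the distribution was installed without its Python modules, which points at the build and not at sys.path.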
I am on Windows 10, with Python 3.10.9, trying to install and import the spark_map package, version 0.2.76 (https://test.pypi.org/project/spark-map/).
The package source code is hosted at GitHub, and the folder structure of this package is essentially this:
root
│
├───spark_map
│ ├───__init__.py
│ ├───functions.py
│ └───mapping.py
│
├───tests
│ ├───functions
│ └───mapping
│
├───.gitignore
├───LICENSE
├───pyproject.toml
├───README.md
└───README.rst
The pyproject.toml file of the package:
[build-system]
requires = ["setuptools>=61.0", "toml"]
build-backend = "setuptools.build_meta"
[project]
name = "spark_map"
version = "0.2.76"
authors = [
{ name="Pedro Faria", email="[email protected]" }
]
description = "Pyspark implementation of `map()` function for spark DataFrames"
readme = "README.md"
requires-python = ">=3.7"
license = { file = "LICENSE.txt" }
classifiers = [
"Programming Language :: Python :: 3",
"License :: OSI Approved :: MIT License",
"Operating System :: OS Independent",
]
dependencies = [
"pyspark",
"setuptools",
"toml"
]
[project.urls]
Homepage = "https://pedropark99.github.io/spark_map/"
Repo = "https://github.com/pedropark99/spark_map"
Issues = "https://github.com/pedropark99/spark_map/issues"
[tool.pytest.ini_options]
pythonpath = [
"."
]
[tool.setuptools]
py-modules = []
As @Dorian Turba suggested, I moved the source code into a src folder. Now, the structure of the package is this:
root
├───src
│ └───spark_map
│ ├───__init__.py
│ ├───functions.py
│ └───mapping.py
│
├───tests
├───.gitignore
├───LICENSE
├───pyproject.toml
├───README.md
└───README.rst
After that, I executed python -m pip install -e . (the log of this command is in the image below). The package was built and installed successfully. However, when I open a new terminal in a different location and try to run python -c "import spark_map", I still get the same error.
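When an import fails only in some terminals, it can help to confirm which interpreter is running and whether it can locate the module at all. The following is a minimal sketch under that assumption, not part of the original troubleshooting; describe_import is a hypothetical helper name, while sys.executable and importlib.util.find_spec are standard library.

```python
import importlib.util
import sys


def describe_import(module_name):
    """Report the running interpreter and where it would load a module from.

    Intended for top-level module names such as "spark_map"; for dotted
    names, find_spec imports the parent package first.
    """
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,
        "found": spec is not None,
        # origin is the file the module would be loaded from, if any
        "origin": getattr(spec, "origin", None),
    }


# Example call:
# print(describe_import("spark_map"))
```

If the interpreter path matches the one pip installed into and "found" is still False, the environment is not the issue and the installed distribution itself is the next thing to inspect.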
I also tried to create a virtual environment (with python -m venv env) and install the package inside it (with pip install -e .). Then, I executed python -c "import spark_map", but the problem remained. I also executed pip list to check whether the package was installed. The full log of commands is in the image below:
The source of the problem was the build process of the package. In other words, pip install was installing an invalid package: one that did not contain the source code.
Basically, I use setuptools to build the package. When I built the package with python -m build, the source code of the package (that is, all the contents of the src directory) was not included in the resulting TAR archive.
The setuptools documentation discusses this issue of finding the source code for your project. In essence, setuptools was not finding the source code of the package, so I needed to help it find these files by adding these two options to my pyproject.toml file:
[tool.setuptools]
packages = ["spark_map"]
package-dir = {"" = "src"}
If you are having a similar problem installing and importing your package, you might have the same problem I did. To check whether that is your case, build your project with python -m build. Then, open the source distribution of your package (that is, the TAR archive) and check whether the source code is inside it. If it is not, then you have this exact problem.
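The check described above can also be scripted. This is a minimal sketch, assuming the source distribution is a gzip-compressed TAR archive as produced by python -m build; the function name python_sources_in_sdist and the example path are hypothetical.

```python
import tarfile


def python_sources_in_sdist(sdist_path):
    """List the .py files stored inside a source distribution (.tar.gz)."""
    with tarfile.open(sdist_path, "r:gz") as archive:
        return [name for name in archive.getnames() if name.endswith(".py")]


# Example call (adjust the path to whatever file appears in your dist folder):
# print(python_sources_in_sdist("dist/spark_map-0.2.76.tar.gz"))
```

An empty list means the archive holds only metadata files such as pyproject.toml and the README, so setuptools did not pick up the package's modules.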