I am writing my first pip package, but I have trouble with relative paths. The package structure is as follows:
.
├── packname
│ ├── __init__.py
│ ├── packfile1.py
│ ├── packfile2.py
│ └── packfile3.py
│
├── datatoload
│ ├── toload1.pkl
│ ├── toload2.pkl
│ ├── toload3.pkl
│ └── toload4.pkl
│
└── requirements.txt
Some Python files in the packname directory need to load data from files in the datatoload directory. I have some questions about managing package files and data.
Is it ok to have a separate folder for the data to load?
Since I want people to use my package, should I add some properties to my package (I read something about __file__ and __path__)?
Moreover, do you have any more advice about this?
Thank you :)
UPDATE: A user in the comments told me that the folder needs to be inside the package folder, as follows:
.
├── packname
│ ├── __init__.py
│ ├── packfile1.py
│ ├── packfile2.py
│ ├── packfile3.py
│ │
│ └── datatoload
│ ├── toload1.pkl
│ ├── toload2.pkl
│ ├── toload3.pkl
│ └── toload4.pkl
│
└── requirements.txt
The most important question I want to ask is: how do I set up the relative path to be used inside the package? For example, if I want to load data saved in toload2.pkl from a function in packfile3.py, can I simply do
load('./datatoload/toload2.pkl')
Would this work when someone downloads my package (together with the datatoload folder)?
Is it ok to have a separate folder for the data to load?
No, it must be inside the package to avoid polluting the installation directory.
…__file__ and __path__…
No need; Python sets these variables automatically on import (__file__ on modules, __path__ on packages).
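To see this for yourself, here is a small sketch that inspects both attributes on an already installed package. The standard-library json package is used only as a stand-in for packname; the exact paths printed depend on your Python installation.

```python
import json  # stand-in for any importable package, e.g. packname
import os

# __file__ is the path of the file the module was loaded from
# (for a package, that is its __init__.py).
print(json.__file__)

# __path__ exists only on packages; it lists the directories
# searched when importing the package's submodules.
print(list(json.__path__))

# The package directory can be derived from __file__.
package_dir = os.path.dirname(json.__file__)
print(package_dir)
```

Neither attribute was assigned by the package author; the import system filled them in.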
load('./datatoload/toload2.pkl')
Would this work when someone downloads my package…?
No, because ./ means the current directory, and the user's current directory could be anything. You need to compute your package directory using os.path.dirname(__file__). See https://stackoverflow.com/a/56843242/7976758/ for an example.
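A minimal sketch of that approach, assuming the datatoload folder sits next to the module that loads from it (as in the updated layout). The helper names data_path and load_pickle are hypothetical, chosen only for illustration:

```python
import os
import pickle


def data_path(module_file, filename):
    """Return the absolute path of `filename` in the datatoload folder
    located next to the module whose __file__ is `module_file`."""
    package_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(package_dir, "datatoload", filename)


def load_pickle(module_file, filename):
    """Unpickle a data file shipped inside the package."""
    with open(data_path(module_file, filename), "rb") as f:
        return pickle.load(f)


# Inside packname/packfile3.py this would be called as:
#     data = load_pickle(__file__, "toload2.pkl")
```

Because the path is built from __file__ rather than from ./, it resolves to the installed package directory no matter where the user runs their script from.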