Search code examples
pythonpypi

What's the advantage of having requirements point directly to git repository rather than PyPi?


Recently I asked a Python library maintainer if he could upload a new tagged version of his library to PyPI (https://pypi.org/). He stated that PyPI is not so important nowadays [1].

Is is true? Do people (i.e., you) indeed tend not to use PyPI? If so, what do you use instead and why?

The requirements.txt apparently allows you to specify the git repository directly, skipping PyPI by using git+https://github.com/<repo> directly. Though I can't think of any particular advantage over using PyPI.

Are there any graphs to support the statement? Should the statement be true I would expect the amount of downloads from PyPI (per interval) would be decreasing in time. PyPI publishes this kind of data to Google BigQuery. Checked now and the number of downloads are as follows:

202009    1565855136 -- the query was run on 11th of September 2020
202008    5155068175
202007    5409386519
202006    5211181171
202005    5108756961
202004    4812648839
202003    4670947975    
202002    4067963794
202001    4155726766
201912    3867376444
201911    3845881964
201910    3922992929
201909    3492788322    
201908    3374679723
201907    3338326277
201906    2998812162    
201905    3028973146
201904    2436032402
201903    2732697164

Obtained using this query:

SELECT
  SUBSTR(_TABLE_SUFFIX, 1, 6) AS `month`,
  COUNT(1) AS num_downloads
FROM `the-psf.pypi.downloads*`
WHERE
  _TABLE_SUFFIX BETWEEN FORMAT_DATE(
      '%Y%m01', DATE_SUB(CURRENT_DATE(), INTERVAL 18 MONTH))
  AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
GROUP BY `month`
ORDER BY `month` DESC

I am aware that this question might not have a correct answer and thus might not be following all the standards of questions asked at StackOverflow. However I can't think of better place to ask on this.

[1] https://github.com/lavr/python-emails/issues/139


Solution

  • PyPI is not so important nowadays Is is true?

    Not at all. The advantages of publishing on PyPI:

    1. One can publish wheels at PyPI so pip install package downloads a platform-specific wheel.

    2. PyPI is hosted at a CDN (content delivery network) so downloading from PyPI is fast.

    3a. pip caches downloaded packages so it doesn't re-download packages for every virtual environment. It caches by their original URLs and the URLs for PyPI are stable.

    3b. pip very badly caches cloned repositories. Usually it re-clones a-new the entire repository.