python git pypi

What is the best way to package large filesets with a PyPI release?


I'm currently working on a PyPI package that uses ~350 MB of audio files together with a Tkinter GUI. Since I cannot upload these audio files to PyPI due to the 50 MB limit, I resorted to a workaround to get the files onto the end user's installation.

I've created a function audio_download(), which executes the following code using GitPython:

import os
from git import Repo  # GitPython; needs the git executable to be installed

def audio_download():
    # Keep the audio files in an 'audio' folder next to this module.
    path = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'audio')
    git_link = GIT_LINK_HERE  # placeholder for the URL of the repo holding the audio files
    if not os.path.isdir(path):
        print('Cloning audio files to the package directory...')
        Repo.clone_from(git_link, path)
        print('Audio engine downloaded to {}'.format(path))
    else:
        print('Audio files already present in {}. If you want to uninstall, you must manually delete the folder.'.format(path))

Simply put, it checks the package directory for the audio file folder, and if it doesn't exist, it clones a GitHub repo containing those files.

Three questions:

  1. Is this safe, or is it bad practice to manipulate packages installed using pip this way?
  2. Is there a better method to do this on all platforms JUST using Python? GitPython requires git to be installed, and I want to make installation as painless as possible for even the most inexperienced Python users.
  3. If I wanted to allow the user to download the files to a folder of their choosing (instead of some deeply nested Python package directory), what is the best way to track that folder within my PyPI package? Should I simply write a file containing the path to the folder, or is there something more elegant?

Solution

  • You could put the large audio files into the distributions with package_data (assuming you're using setuptools). However, files this big are not ideal for a distribution, and you would probably have to sort out the size limit with PyPI. It's probably better to ask the user (for example via a GUI dialog) to trigger the download after the first start of the application, and to store the files in the user's application directory (or let the user choose a target directory, again via the GUI dialog).
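
A minimal sketch of that suggestion, using only the standard library so that neither git nor GitPython is needed. It assumes the audio files are published as a single zip archive at some HTTP(S) URL. The names APP_NAME, AUDIO_ZIP_URL, default_data_dir, save_audio_dir, load_audio_dir and download_audio are hypothetical and not part of the original question. The chosen folder is tracked in a small JSON file in the per-user data directory, which is one simple way to handle question 3.

```python
import json
import os
import sys
import urllib.request
import zipfile
from pathlib import Path

APP_NAME = "myaudioapp"  # hypothetical application name
AUDIO_ZIP_URL = "https://example.com/audio.zip"  # hypothetical placeholder URL


def default_data_dir(app_name=APP_NAME):
    """Return a conventional per-user data directory for this platform."""
    if sys.platform == "win32":
        base = os.environ.get("APPDATA") or str(Path.home() / "AppData" / "Roaming")
    elif sys.platform == "darwin":
        base = str(Path.home() / "Library" / "Application Support")
    else:
        base = os.environ.get("XDG_DATA_HOME") or str(Path.home() / ".local" / "share")
    return Path(base) / app_name


def save_audio_dir(audio_dir, cfg=None):
    """Remember the folder that holds the audio files in a small JSON file."""
    cfg = Path(cfg) if cfg else default_data_dir() / "config.json"
    cfg.parent.mkdir(parents=True, exist_ok=True)
    cfg.write_text(json.dumps({"audio_dir": str(audio_dir)}))


def load_audio_dir(cfg=None):
    """Return the remembered audio folder, or None if nothing was saved yet."""
    cfg = Path(cfg) if cfg else default_data_dir() / "config.json"
    if not cfg.is_file():
        return None
    return Path(json.loads(cfg.read_text())["audio_dir"])


def download_audio(target_dir, url=AUDIO_ZIP_URL):
    """Download the zip archive at `url` and unpack it into `target_dir`."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    archive = target_dir / "audio.zip"
    urllib.request.urlretrieve(url, str(archive))
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)
    archive.unlink()  # the archive itself is no longer needed
    return target_dir
```

On first start, the GUI could call load_audio_dir(). If it returns None, the application could ask the user for a folder (for example with tkinter.filedialog.askdirectory, falling back to default_data_dir() / "audio"), then call download_audio() and save_audio_dir(). Nothing is written into the installed package directory, so pip can still upgrade or uninstall the package cleanly.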