Tags: python, setuptools, distutils, egg

How to distribute / access data files in a Python egg?


I'm writing a Django application that uses pip and virtualenv to manage its development environment.

One of the dependencies, pkgme, comes with many data files which are its "backends" and are configured in its setup.py with data_files=$FOO (rather than package_data).

When pkgme looks for its backends, it looks in os.path.join(sys.prefix, "share", "pkgme", "backends"). This works great when pkgme has been installed normally and seems to match the documentation, but it does not work when pkgme is installed as an egg.

There, the data files are installed under $VIRTUAL_ENV/lib/python2.7/site-packages/pkgme-0.1-py2.7.egg/share rather than the expected $VIRTUAL_ENV/share.
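
To make the mismatch concrete, here is a minimal sketch of the two locations involved. The function names are hypothetical, and the layout below the egg's share directory is an assumption that mirrors the normal install layout.

```python
import os
import sys


def sys_prefix_location(prefix=None):
    """Where pkgme currently looks: <prefix>/share/pkgme/backends."""
    return os.path.join(prefix or sys.prefix, "share", "pkgme", "backends")


def egg_location(site_packages, egg_name="pkgme-0.1-py2.7.egg"):
    """Where an egg install puts the same files (assumed sub-layout)."""
    return os.path.join(site_packages, egg_name, "share", "pkgme", "backends")


if __name__ == "__main__":
    # The two paths differ, so the sys.prefix lookup misses the egg's files.
    print(sys_prefix_location())
    print(egg_location(os.path.join(sys.prefix, "lib", "python2.7", "site-packages")))
```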

Which leaves me with two questions:

  1. Should I be using something other than the os.path.join above to find the data files regardless of whether we are using an egg installation or a traditional system installation? If so, what?
  2. Should I be distributing my data files differently so as to make them more readily available in an egg?

Note that I know about pkgutil.get_data, but would rather not use it. I'm not interested in the contents of these data files; I need their location on disk so that I can execute them.

My current plan is to do this:

  • Use package_data instead of data_files
  • Change pkgme to look for backends relative to pkgme.__file__ rather than sys.prefix
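
A rough sketch of what that package-relative lookup might look like. The helper name and the "backends" subdirectory are assumptions, not pkgme's actual code. Note that this only yields an executable path when the egg is installed unzipped; inside a zipped egg, __file__ points into the archive.

```python
import os


def package_data_dir(module, subdir="backends"):
    """Return <directory containing the module>/<subdir>.

    `module` is any imported package or module object; `subdir` is a
    hypothetical name for the directory shipped via package_data.
    """
    package_root = os.path.dirname(os.path.abspath(module.__file__))
    return os.path.join(package_root, subdir)
```

With pkgme importable, usage would be along the lines of `import pkgme; package_data_dir(pkgme)`.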

Solution

  • I ended up doing the following:

    • Changed pkgme to use pkg_resources.resource_filename() to find its own included backends
    • Added an entry point that any backend written in Python can use to publish the location of its own backend scripts
    • Kept the sys.prefix-based check for any backends that don't want to use Python

    The diff can be found here: http://bazaar.launchpad.net/~pkgme-committers/pkgme/trunk/revision/86
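
For readers who don't want to dig through the diff, the three lookups above can be sketched roughly as follows. This is not the code from the linked revision: the function names, the "pkgme.backends" entry point group, and the convention that each entry point resolves to a callable returning a directory are all assumptions made for illustration.

```python
import os
import sys


def builtin_backend_dir(package="pkgme", subdir="backends"):
    """Locate backends shipped inside the package itself.

    resource_filename() returns a real filesystem path, extracting the
    resource to a cache directory if the egg is zipped.
    """
    import pkg_resources  # ships with setuptools; imported lazily
    return pkg_resources.resource_filename(package, subdir)


def plugin_backend_dirs(group="pkgme.backends"):
    """Collect backend directories published through entry points.

    Assumption: the group name is hypothetical, and each entry point is
    assumed to load to a callable that returns a directory path.
    """
    import pkg_resources
    return [ep.load()() for ep in pkg_resources.iter_entry_points(group)]


def prefix_backend_dir(prefix=None):
    """The original sys.prefix-based location, kept for non-Python backends."""
    return os.path.join(prefix or sys.prefix, "share", "pkgme", "backends")


def backend_search_path(*candidates):
    """Merge candidate directories, dropping duplicates but keeping order."""
    seen, result = set(), []
    for path in candidates:
        if path not in seen:
            seen.add(path)
            result.append(path)
    return result
```

A caller would then combine them with something like `backend_search_path(builtin_backend_dir(), *plugin_backend_dirs(), prefix_backend_dir())` and skip any directory that does not exist.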