Tags: python, linux, pip

How to install pip module system-wide on Linux, if the distro's package manager does not provide that module?


I am used to installing Python modules system-wide via sudo pip install <package>, but both pip itself and several resources online (e.g. here) suggest not using sudo. The main reason: to avoid interfering with Python libraries the system itself depends on.

OK, so no more sudo pip. However, I also don't want to install everything into virtual environments, since we don't want one for each of our Python projects. A per-user pip install (without sudo) is not my preferred way of installing the required modules either, since multiple users on that machine need them.

So what is the most "Pythonic" way to install a module system-wide, if sudo pip is not the way to go and the module is not provided by the Linux distribution's package manager? In my case, for example, I want to install tensorflow for all users on the machine, but apt on Ubuntu does not provide that package.


Solution

  • I'm currently grappling with this as well, although my use case is slightly different.

    There is a brief discussion with some recommendations here.

    In my case, I'm trying to deploy a Python application that can be used by multiple users, rather than an environment that might be used by, and altered differently by, multiple users. Your case sounds like it might be closer to the latter, where all users want tensorflow, but they may also install additional packages on top, and this might possibly be different for different users.

    I'm leaning towards the approach suggested by Laurie/EpicWink:

    One suggestion I have for multi-user installs is to install packages under a sys-admin controlled prefix (eg --prefix /srv/my-company) and then add that to each user’s PYTHONPATH (eg PYTHONPATH=/srv/my-company/lib/python3.10/site-packages:$PYTHONPATH). For user installs, just use --user or a virtual-env.

    In my setup, I'll handle the per-user PYTHONPATH addition in an upstream deployment/configuration step. Of course, there are drawbacks to this approach as well: PYTHONPATH entries come before a user's site-packages on sys.path, so packages in the shared prefix can shadow a user's own installs.
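    A minimal sketch of that approach, using the answer's example prefix /srv/my-company (the path and Python version are assumptions, not requirements). The install step itself needs admin rights and network access, so it is shown as a comment; the second part demonstrates that a PYTHONPATH entry really does land on sys.path:

    ```shell
    # One-time, admin-side install into a shared prefix (sketch):
    #   sudo pip install --prefix /srv/my-company tensorflow

    # The site-packages dir that ends up under the prefix:
    SITE="/srv/my-company/lib/python3.10/site-packages"

    # Each user (or a login script) prepends it to PYTHONPATH;
    # Python then picks the directory up on sys.path:
    RESULT=$(PYTHONPATH="$SITE" python3 -c 'import os, sys; print(os.environ["PYTHONPATH"] in sys.path)')
    echo "$RESULT"   # True
    ```

    The export would typically go into /etc/profile.d/ or each user's shell profile, e.g. export PYTHONPATH="/srv/my-company/lib/python3.10/site-packages:$PYTHONPATH".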

    In your case, it might be reasonable to have instructions for each user to set up and manage their own tensorflow virtualenv, or to give them a starting point by creating it for them if you have sys-admin privileges. The main idea would be to provide one virtualenv per user that spans multiple projects, so you don't need 10 copies of tensorflow for 10 individual projects.
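    That per-user setup might look like the following sketch (the env location is an assumption; the tensorflow install is commented out since it needs network access):

    ```shell
    # One shared virtualenv per user, reused across projects:
    VENV="$HOME/.venvs/shared"
    python3 -m venv "$VENV"

    # Install heavy dependencies once into it:
    #   "$VENV/bin/pip" install tensorflow

    # Sanity check: the env's interpreter reports its own prefix
    "$VENV/bin/python" -c 'import sys; print(sys.prefix)'
    ```

    Each project then activates this env (source "$VENV/bin/activate") instead of creating its own.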

    I use a similar approach: I have an "everything" virtualenv where I install Spyder, numpy, pandas, machine-learning tools, etc. Things that I use again and again for basic research, analysis and one-off scripts.

    If I need a sandboxed environment for specific application development (usually something that is going to be "deployed"), I switch to a per-project virtualenv instead.

    Sorry it's not a definitive answer, but I haven't come across one in my travels... Hope this helps somewhat, and if you find a "better" solution, I'd love to hear about it.