Search code examples
linuxdebianpackage-managersdpkg

For any Linux Distro's packaging system, find the maximum number of simultaneously installable packages


Some packages conflict, so it's impossible to install all available packages at once. What's the maximum possible number of installable packages for a given system? A brute-force trial and error method would be:

  1. make a list of every possible of package name, i.e. dglob -a > list
  2. from that, make sublists slist1 slist2 slist3 ... of every possible package combination. On my system dglob -a | wc -l returns 91327, which would require an unfeasibly large number (1.467×10^27492) of files.
  3. run apt-get install on each list, and rm those that produce conflicts.
  4. sort remaining lists by number of lines, and show the longest one. wc -l slist* | head -n -1 | sort -g | tail -1.

Easy, but too heavy on resources, so maybe there's some more feasible method.

Various related questions follow from that, such as:

  • given a package 'foo', how to find the maximum number of installable packages that don't conflict with 'foo'?

  • For all possible packages, which has the smallest such maximum, (making it the most "quarrelsome" package)?

(Note: The question applies to Debian, Red Hat, and any distro or OS with a packaging system that maps out package conflicts. Answers for any applicable platform would be valid.)


Background:

Debian has tens of thousands of packages. dglob (from the debian-goodies package) is handy for getting a quick count:

# show how many packages installed and available on the current system
echo $(dglob    | wc -l) packages installed of \
     $(dglob -a | wc -l) packages.

Example output, (both numbers may periodically fluctuate after updates and upgrades, and will vary between systems):

5107 packages installed of 91327 packages.

The number 5107's not a maximum of course, but there must be a maximum.


Solution

  • The brute force option is the only option in this case. Here is a paper that will describe why in depth but the issue is that package installation and dependency/conflict resolution is a NP-Complete problem.

    A problem is in NP if every TRUE answer has an easily-checked polynomial-size explanation. In this case this can be done by listing the installed packages and available packages.

    Debian package installation is in NP-hard if an efficient solution for the problem can be adapted to efficient solutions to every other problem in NP. I will defer to the above listed paper as it is a bit complex to prove here but it can be encoded as 3-SAT.

    As Debian package installation is in NP and NP-hard, thus it is NP-complete.

    Here are some ways the default solver in APT tries to avoid the NP-completeness:

    • Use of heuristics
    • A preference for the first element in an or-group
    • A strict package version convention
    • Giving up when it encounters a major conflict.

    Basically the constraints have to be specifically designed to get the problem to fall into known tractable classes for a NP-Complete problem like HORN-SAT

    Unfortunately find the maximum possible number of installable packages for a given system pretty much excludes all of the known tractable classes that I am aware of.

    So brute force is the only option and an expensive one until a suitable tractable class is discovered or if someone proves that P=NP.