The python-apt package provides two APIs for accessing the APT cache, apt_pkg.Cache and apt.Cache:
A Cache object represents the cache used by APT which contains information about packages. The object itself provides no means to modify the cache or the installed packages, see the classes DepCache and PackageManager for such functionality.
The APT cache file contains a hash table mapping names of binary packages to their metadata. A Cache object is the in-core representation of the same. It provides access to APTs idea of the list of available packages.
It is unclear to me why they would contain different sets of packages, but indeed they do:
import apt, apt_pkg
cache = apt_pkg.Cache(apt.progress.base.OpProgress())
cache_pkgs = set(pkg.get_fullname() for pkg in cache.packages)
aptcache = apt.Cache(apt.progress.base.OpProgress())
aptcache_pkgs = set(pkg.fullname for pkg in aptcache)
print(len(cache_pkgs), len(aptcache_pkgs))
# on my system, this outputs: 92488 64447
Though it appears that the latter is a subset of the former:
print(aptcache_pkgs - cache_pkgs)
# on my system, this outputs: set()
Some scripts like this one from Ubuntu will use both, like this:
# we need another cache that has more pkg details
with apt.Cache() as aptcache:
    for pkg in cache.packages:
        aptcache[pkg.get_fullname()]
What is the distinction between these two methods of accessing the APT cache and why do they return different sets of packages?
Answer from Julian Andres Klode, a maintainer of the project:
apt.Cache only includes real packages; the one in apt_pkg also has virtual packages. You can see in apt/cache.py how it filters the apt_pkg.Cache.
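To make the answer concrete, here is a minimal sketch of the kind of filtering it describes. It assumes the filter amounts to keeping only packages whose `has_versions` attribute is true, which is my reading of apt/cache.py rather than something stated in the answer. The helper names `split_real_and_virtual` and `load_low_level_cache` are hypothetical, and the stand-in objects exist only so the sketch runs on a machine without python-apt.

```python
from types import SimpleNamespace


def split_real_and_virtual(packages):
    """Partition package names by whether the package has any version."""
    real, versionless = set(), set()
    for pkg in packages:
        name = pkg.get_fullname(pretty=True)
        if pkg.has_versions:
            real.add(name)
        else:
            # No version of its own: e.g. a purely virtual package.
            versionless.add(name)
    return real, versionless


def load_low_level_cache():
    """Return an apt_pkg.Cache, or None if python-apt is not installed."""
    try:
        import apt_pkg
    except ImportError:
        return None
    apt_pkg.init()
    return apt_pkg.Cache(None)  # None: open the cache without progress output


if __name__ == "__main__":
    cache = load_low_level_cache()
    if cache is not None:
        packages = cache.packages
    else:
        # Hypothetical stand-ins so the sketch still runs without python-apt.
        packages = [
            SimpleNamespace(has_versions=True,
                            get_fullname=lambda pretty=False: "bash"),
            SimpleNamespace(has_versions=False,
                            get_fullname=lambda pretty=False: "mail-transport-agent"),
        ]
    real, versionless = split_real_and_virtual(packages)
    print(len(real), len(versionless))
```

If that assumption holds, the first count should be close to the number of entries in apt.Cache on the same system, and the second should account for most of the gap between the two totals in the question.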