
Conda command to list size of packages


I would like to find out the size of my conda packages so that I can delete the large, seldom-used ones. Which conda command shows the package size?

conda list will list the packages but does not show the package size.

I welcome other methods to find out package size.

I'm using Windows 10.


Solution

  • Mine the conda-meta

    One way to get at this is by mining the JSON metadata files for each package inside an environment's conda-meta/ directory. There are two types of sizes listed:

    • size - total zipped tarball size for the package
    • size_in_bytes - individual unzipped file sizes within a package

    Since you seem interested in the total size of each package, let's use the simpler size field. This gives a quick ranking of packages by their download size.

    Commands

    ## activate the environment of interest
    conda activate foo
    
    ## search all the JSONs for '"size":'
    grep '"size":' "${CONDA_PREFIX}"/conda-meta/*.json |\
    
      ## sort result
      sort -k3rn |\
    
      ## show only filename
      sed 's/.*conda-meta\///g' |\
    
      ## print with columns
      column -t
    

    Example Output

    jaxlib-0.1.67-py39h6e9494a_0.json:          "size":  38576847,
    scipy-1.6.3-py39h056f1c0_0.json:            "size":  19495906,
    python-3.9.4-h9133fd0_0_cpython.json:       "size":  13160553,
    libopenblas-0.3.15-openmp_h5e1b9a4_1.json:  "size":  9163719,
    numpy-1.20.3-py39h7eed0ac_1.json:           "size":  5732039,
    tk-8.6.10-hb0a8c7a_1.json:                  "size":  3420669,
    openssl-1.1.1k-h0d85af4_0.json:             "size":  1985060,
    sqlite-3.35.5-h44b9ce1_0.json:              "size":  1810221,
    libgfortran5-9.3.0-h6c81a4c_22.json:        "size":  1766473,
    pip-21.1.2-pyhd8ed1ab_0.json:               "size":  1147500,
    libcxx-11.1.0-habf9029_0.json:              "size":  1055976,
    setuptools-49.6.0-py39h6e9494a_3.json:      "size":  972968,
    ncurses-6.2-h2e338ed_4.json:                "size":  901840,
    jax-0.2.14-pyhd8ed1ab_0.json:               "size":  571585,
    llvm-openmp-11.1.0-hda6cdc1_1.json:         "size":  274368,
    readline-8.1-h05e3726_0.json:               "size":  272444,
    xz-5.2.5-haf1e3a3_1.json:                   "size":  233058,
    certifi-2021.5.30-py39h6e9494a_0.json:      "size":  144599,
    ca-certificates-2021.5.30-h033912b_0.json:  "size":  139088,
    tzdata-2021a-he74cb21_0.json:               "size":  123802,
    zlib-1.2.11-h7795811_1010.json:             "size":  104180,
    absl-py-0.12.0-pyhd8ed1ab_0.json:           "size":  98565,
    tqdm-4.61.0-pyhd8ed1ab_0.json:              "size":  81513,
    opt_einsum-3.3.0-pyhd8ed1ab_1.json:         "size":  54494,
    libffi-3.3-h046ec9c_2.json:                 "size":  46425,
    wheel-0.36.2-pyhd3deb0d_0.json:             "size":  31381,
    python-flatbuffers-2.0-pyhd8ed1ab_0.json:   "size":  28606,
    libgfortran-5.0.0-9_3_0_h6c81a4c_22.json:   "size":  19280,
    six-1.16.0-pyh6c4a22f_0.json:               "size":  14259,
    libblas-3.9.0-9_openblas.json:              "size":  11762,
    libcblas-3.9.0-9_openblas.json:             "size":  11671,
    liblapack-3.9.0-9_openblas.json:            "size":  11671,
    python_abi-3.9-1_cp39.json:                 "size":  3921,
    

    The above output shows that jaxlib is the largest package, followed by scipy, then the Python interpreter itself. In this case, removing jaxlib would also entail removing jax, since jax depends on it.
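
The pipeline above relies on grep, sort, sed and column, which a stock Windows 10 shell may not provide. The same ranking can be sketched in Python, which is normally available inside a conda environment. This is only a minimal sketch: the function name rank_packages_by_size is made up for illustration, and it assumes every JSON file in conda-meta/ carries a top-level "size" key (a missing key is counted as 0).

```python
import json
import os
from pathlib import Path


def rank_packages_by_size(prefix):
    """Return (size, name) pairs for every package in the environment, largest first."""
    sizes = []
    for meta_file in Path(prefix, "conda-meta").glob("*.json"):
        with open(meta_file, encoding="utf-8") as fh:
            record = json.load(fh)
        # "size" is the compressed package size; a missing key counts as 0 (assumption)
        sizes.append((record.get("size", 0), meta_file.stem))
    return sorted(sizes, reverse=True)


if __name__ == "__main__":
    # CONDA_PREFIX is set by `conda activate`; do nothing if no environment is active
    prefix = os.environ.get("CONDA_PREFIX")
    if prefix:
        for size, name in rank_packages_by_size(prefix):
            print(f"{size / 1e6:10.2f} MB  {name}")
```

Run it with the environment of interest activated; it prints one line per package, with the size in megabytes followed by the metadata file name without the .json extension.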

    Notes

    I think the above serves as a first approximation for ranking the packages by size. Summing size_in_bytes could be more exact, but to be thorough one would also need to check which individual files are hardlinked, since those shouldn't be counted against the package at the per-environment level: a hardlinked file exists as a single copy on disk that is reused across environments.
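
The hardlink-aware tally described in the note could look roughly like the sketch below. It rests on assumptions that should be checked against your own metadata files: that each package JSON lists its files under paths_data.paths, that each entry has a "_path" relative to the environment prefix alongside "size_in_bytes", and that a link count of 1 means the file is not shared with the package cache. The function name installed_size is hypothetical.

```python
import json
import os
from pathlib import Path


def installed_size(meta_file, prefix):
    """Return (total, exclusive) byte counts for one package's metadata file.

    total     - sum of size_in_bytes over all files listed for the package
    exclusive - the part of that sum in files whose link count is 1
    """
    with open(meta_file, encoding="utf-8") as fh:
        record = json.load(fh)
    total = 0
    exclusive = 0
    for entry in record.get("paths_data", {}).get("paths", []):
        nbytes = entry.get("size_in_bytes", 0)
        total += nbytes
        rel_path = entry.get("_path")  # assumed key name for the relative file path
        if rel_path is None:
            continue
        try:
            # a link count above 1 means another name on disk shares this data
            if os.lstat(Path(prefix, rel_path)).st_nlink == 1:
                exclusive += nbytes
        except OSError:
            pass  # listed in the metadata but missing on disk
    return total, exclusive
```

Calling it for every file in conda-meta/ and sorting by the second value would rank packages by the space that removing them from this one environment might actually free.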