Deleting artifacts from JFrog Artifactory doesn't result in freed disk space


I have JFrog Artifactory 6.16.0 Pro. I installed the artifactCleanup plugin and ran it against my repositories. It deleted about 500 GB. Next, I deleted the files from the trash can, which is now empty. Finally, I ran "Garbage Collection" manually.

The space wasn't freed. The Storage section shows the following info:

Binaries Size: 1.67 TB
Artifacts Size: 663.15 GB
Optimization:  257.79%

How can I actually free disk space after deleting artifacts?


Solution

  • First, let's review how Artifactory GC works. From the docs:

    When a new file is deployed, Artifactory checks if a binary with the same checksum already exists and if so, links the repository path to this binary. Upon deletion of a repository path, Artifactory does not delete the binary since it may be used by other paths. However, once all paths pointing to a binary are deleted, the file is actually no longer being used. To make sure your system does not become clogged with unused binaries, Artifactory periodically runs a "Garbage Collection" to identify unused ("deleted") binaries and dispose of them from the datastore. By default, this is set to run every 4 hours and is controlled by a cron expression.

    This means that if you store the same 5 GB file 100 times, the artifacts size is 500 GB while the binaries size is still 5 GB. This is because Artifactory de-duplicates through checksum-based storage.
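
    The de-duplication described above can be illustrated with a toy model. This is only a sketch of the idea, not Artifactory's actual implementation: the ChecksumStore class and its method names are hypothetical, and a 5,000-byte payload stands in for the 5 GB file.

```python
import hashlib


class ChecksumStore:
    """Toy model of checksum-based storage (hypothetical, not Artifactory code)."""

    def __init__(self):
        self.binaries = {}  # checksum -> size in bytes, stored once
        self.paths = {}     # repository path -> checksum

    def deploy(self, path, content):
        checksum = hashlib.sha1(content).hexdigest()
        # Store the bytes only if this checksum is new; otherwise just link.
        self.binaries.setdefault(checksum, len(content))
        self.paths[path] = checksum

    def delete(self, path):
        # Removing a path leaves the binary in place.
        del self.paths[path]

    def artifacts_size(self):
        # Logical size: every path counts, duplicates included.
        return sum(self.binaries[c] for c in self.paths.values())

    def binaries_size(self):
        # Physical size: each unique binary counts once.
        return sum(self.binaries.values())

    def garbage_collect(self):
        # Drop binaries that no path references any more; return bytes freed.
        live = set(self.paths.values())
        dead = [c for c in self.binaries if c not in live]
        return sum(self.binaries.pop(c) for c in dead)


store = ChecksumStore()
payload = b"x" * 5_000  # stands in for the 5 GB file
for i in range(100):
    store.deploy(f"repo/copy-{i}.bin", payload)
```

    In this model, deleting every path brings the artifacts size to zero, but the binaries size only drops once garbage_collect() runs.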

    The binaries size should never exceed the artifacts size; in other words, the optimization figure shouldn't go above 100%. However, the binaries size is essentially what you would get from running a "df" command, so until GC has actually removed the unreferenced binaries, they still count toward it.

    This brings us to your issue, which may not be a bug but expected behavior, as also noted in the previously linked docs:
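
    As a quick arithmetic check, the figures in the question are consistent with the optimization value being the binaries size divided by the artifacts size. The sketch below assumes exactly that ratio and assumes binary units (1 TB = 1024 GB); the function name is made up for illustration.

```python
def optimization_percent(binaries_bytes, artifacts_bytes):
    """Binaries size as a percentage of artifacts size (assumed definition)."""
    return 100.0 * binaries_bytes / artifacts_bytes


GB = 1024 ** 3
TB = 1024 ** 4

# Figures from the question: 1.67 TB of binaries, 663.15 GB of artifacts.
reported = optimization_percent(1.67 * TB, 663.15 * GB)

# The de-duplication example: 5 GB of binaries behind 500 GB of artifacts.
deduplicated = optimization_percent(5 * GB, 500 * GB)

print(f"question: {reported:.2f}%  de-duplicated example: {deduplicated:.2f}%")
```

    The first value comes out at roughly 258%, close to the 257.79% shown in the UI (the small gap is plausibly rounding in the displayed 1.67 TB), which suggests that well over half of the stored binaries are no longer referenced by any artifact.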

    Unreferenced binaries, (including existing unreferenced binaries or artifacts that were manually deleted from the trashcan), will be deleted during the previous Full GC strategy that runs every 20 GC iterations (configurable, 'artifactory.gc.skipFullGcBetweenMinorIterations=20').

    This tells us that the actual deletion of binaries happens only on every 20th iteration. Please try to trigger GC manually 20 times; the output of the full GC will differ from the regular one, giving you a summary of what was deleted.
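
    Rather than clicking the button 20 times, the runs could be scripted against the REST API's "Run Garbage Collection" endpoint (POST /api/system/storage/gc, admin only). The following is a minimal sketch: the base URL, the credentials, and the 60-second pause between runs are all assumptions to adapt to your installation.

```python
import base64
import time
import urllib.request


def build_gc_request(base_url, user, password):
    """Build a POST request for the Run Garbage Collection endpoint."""
    url = base_url.rstrip("/") + "/api/system/storage/gc"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=b"",
        method="POST",
        headers={"Authorization": "Basic " + token},
    )


def trigger_gc(base_url, user, password, times=20, pause_seconds=60):
    """Trigger GC `times` times, pausing so each run has a chance to finish."""
    for i in range(times):
        request = build_gc_request(base_url, user, password)
        with urllib.request.urlopen(request) as response:
            print(f"GC run {i + 1}/{times}: HTTP {response.status}")
        # The pause length is a guess; check the logs to see how long a run takes.
        time.sleep(pause_seconds)


# Example call (placeholder URL and credentials, left commented out on purpose):
# trigger_gc("http://localhost:8081/artifactory", "admin", "password")
```

    Watch the Artifactory log while this runs; the full GC iteration should log a different summary from the minor ones.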

    If that doesn't work, check the permissions of the Artifactory user to make sure it can delete files from the filestore.
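
    One way to check the permissions is to run a small script as the Artifactory OS user. This is a sketch for POSIX systems only; the helper names are hypothetical, and the filestore location in the comment ($ARTIFACTORY_HOME/data/filestore) is an assumption about a default 6.x layout, so substitute your own path.

```python
import os


def can_delete_from(directory):
    """True if the current user may remove entries from `directory`."""
    # On POSIX, unlinking a file needs write and search (execute)
    # permission on the directory that contains it.
    return os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK)


def undeletable_dirs(root):
    """Walk `root` and list directories the current user cannot delete from."""
    return [d for d, _subdirs, _files in os.walk(root) if not can_delete_from(d)]


# Example (assumed path, run as the Artifactory user):
# print(undeletable_dirs("/var/opt/jfrog/artifactory/data/filestore"))
```

    An empty list means the user can remove files everywhere under the given root; any directory it returns is a candidate for a chown or chmod fix.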