Similar to this link, but for Mercurial. I'd like to find the files that contribute the most to the size of my Mercurial repository.
I intend to use hg convert to create a new, smaller repository. I'm just not sure yet which files are contributing to the repository size. They could be files that have already been deleted.
What is a good way to find these anywhere in the repository history? There are over 20,000 commits. I'm thinking of a PowerShell script, but I'm not sure of the best way to go about it.
Check hg help fileset. Something like
hg files "set:size('>1M')"
should do the trick for you. You might need to iterate over all revisions, though, as it only operates on one revision at a time. In bash I'd try something like
for i in `hg log -r"all()" "set:size('>400k')" --template="{rev}\n"`; do hg files -r$i "set:size('>400k')"; done | sort | uniq
This can probably be optimized, since it currently does some duplicate work and may run for quite a while; on the OpenTTD repository with 22000 commits it took just short of 10 minutes on my laptop.
(Also check hg help on templates, files and grep.)
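The loop above only prints file names. A possible variant that also reports the largest size each file reached is sketched below. It assumes a Mercurial version whose hg files command supports the {size} and {path} template keywords (check hg help files), and the function names max_size_per_path and list_large_files are made up for this example. Note that the size of a file at a revision is only a rough proxy for how much it adds to the repository, since Mercurial stores compressed deltas.

```shell
#!/usr/bin/env bash
# Sketch: list large files across history together with their largest size.
# Assumption: `hg files` accepts --template with {size} and {path}.
# Function names are illustrative, not part of Mercurial.

# Read "SIZE<TAB>PATH" lines on stdin, keep the largest size seen per path,
# and print "SIZE<TAB>PATH" sorted by size, largest first.
max_size_per_path() {
    awk -F'\t' '
        !($2 in max) || $1 + 0 > max[$2] + 0 { max[$2] = $1 }
        END { for (p in max) printf "%s\t%s\n", max[p], p }
    ' | sort -t "$(printf '\t')" -k1,1nr -k2,2
}

# For each revision selected the same way as in the loop above, emit
# "SIZE<TAB>PATH" for every file over the threshold (default 400k).
list_large_files() {
    local threshold="${1:-400k}"
    local rev
    for rev in $(hg log -r "all()" "set:size('>$threshold')" --template "{rev}\n"); do
        hg files -r "$rev" "set:size('>$threshold')" --template "{size}\t{path}\n"
    done
}
```

Run inside the repository, something like list_large_files 1M | max_size_per_path | head -n 20 should give the twenty largest candidates, which could then be excluded through a filemap passed to hg convert.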