Is there any way to get an immutable file ID for a file in a repository?
I need an identifier that will survive a file rename. Suppose there was a file Test01.txt and it was renamed to Test02.txt (using the TortoiseHg rename menu item or the hg rename command). I want to have some ID that corresponds to Test01.txt at revision 1 and to Test02.txt at revision 2.
Mercurial does not give any ID to files. This is different from some other systems, such as Bazaar, where each file (and directory) has a unique ID that follows the file throughout its lifetime.
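The structure in a Mercurial repository is as follows: each changelog entry points to a manifest entry, and each manifest entry points to one filelog entry per tracked file.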
So if you add Test01.txt in revision 0, then you'll have a chain like this:
changelog@0 -> manifest@0 -> Test01.txt@0
If you now rename the file and make a new commit, you will create a new changelog and manifest entry, and create a new filelog for Test02.txt:
changelog@1 -> manifest@1 -> Test02.txt@0
The new Test02.txt filelog entry will reference the Test01.txt entry. This is how Mercurial can keep track of renames:
$ hg debugdata Test02.txt 0
copy: Test01.txt
copyrev: 0936f74a58571dd87ad343cc3d6ae8434ad86fc4
test01
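If you want to read the copy information programmatically, here is a minimal Python sketch. It assumes the raw filelog data frames its metadata between two `\x01\n` markers (they are not visible in the terminal output above) and that file names are UTF-8. The function name `parse_filelog_meta` and the sample bytes are my own, reconstructed from the output above.

```python
def parse_filelog_meta(raw: bytes):
    """Split raw filelog revision data into (metadata dict, file content).

    Assumption: metadata, when present, sits between two b"\\x01\\n" markers
    at the very start of the data, one "key: value" pair per line.
    """
    marker = b"\x01\n"
    if not raw.startswith(marker):
        # No metadata header: the whole blob is file content.
        return {}, raw
    end = raw.index(marker, len(marker))
    meta = {}
    for line in raw[len(marker):end].split(b"\n"):
        if not line:
            continue
        key, value = line.split(b": ", 1)
        # Assumption: names are UTF-8; real repositories may differ.
        meta[key.decode()] = value.decode()
    return meta, raw[end + len(marker):]


# Sample bytes reconstructed by hand from the terminal output above.
sample_filelog = (
    b"\x01\n"
    b"copy: Test01.txt\n"
    b"copyrev: 0936f74a58571dd87ad343cc3d6ae8434ad86fc4\n"
    b"\x01\n"
    b"test01\n"
)
meta, content = parse_filelog_meta(sample_filelog)
print(meta["copy"], meta["copyrev"])
```

A filelog entry without the header comes back with an empty metadata dict, which is how you can tell that the entry does not record a rename.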
The best "file ID" you can talk about is therefore the ID of the first entry in the original filelog. You can dig it out with hg debugindex:
$ hg debugindex Test01.txt
   rev    offset  length   base linkrev nodeid       p1           p2
     0         0       8      0       0 0936f74a5857 000000000000 000000000000
The "nodeid" column gives you the IDs for the revlog entries in the filelog for Test01.txt. Here we see that the first revision of the file has ID 0936f74a5857. This is just a short, 12-character prefix of the full 40-character SHA-1 hash. If you need the full hash, then read on...
The "linkrev" column tells you that this version of the file is referenced by changeset 0. You can look up the data in that changelog entry with hg debugdata -c 0, but for our purposes the normal hg log command also has the information:
$ hg log -r 0 --debug
changeset: 0:8e62ecaada0e5ba9efec234d0d9a66583347becf
phase: draft
parent: -1:0000000000000000000000000000000000000000
parent: -1:0000000000000000000000000000000000000000
manifest: 0:0537c846cd545da8f826b9d94fdb2fdae457bd07
user: Martin Geisler <[email protected]>
date: Thu Feb 02 09:00:18 2012 +0100
files+: Test01.txt
extra: branch=default
description:
01
We're interested in the manifest ID. You can now look up the data in the correct manifest entry with:
$ hg debugdata -m 0537c846cd545da8f826b9d94fdb2fdae457bd07
Test01.txt0936f74a58571dd87ad343cc3d6ae8434ad86fc4
There is really a NUL byte between the file name and the filelog ID, but it's not visible in your terminal. You now have the full filelog ID for the first revision of the Test01.txt file.
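Because of that NUL byte, it is easier to split the manifest data in a script than by eye. The following is a rough Python sketch, assuming each manifest line has the form file name, NUL, 40 hex digits, optional flags (such as `x` for executable), newline. The name `parse_manifest` and the sample bytes are illustrative only, and UTF-8 file names are assumed.

```python
def parse_manifest(raw: bytes):
    """Map each file name to (full hex filelog ID, flags) from raw manifest data.

    Assumption: one entry per line, laid out as
    b"<file name>\\x00<40 hex digits><optional flags>\\n".
    """
    entries = {}
    for line in raw.split(b"\n"):
        if not line:
            continue
        name, rest = line.split(b"\x00", 1)
        node, flags = rest[:40], rest[40:]
        # Assumption: names are UTF-8; real repositories may differ.
        entries[name.decode()] = (node.decode(), flags.decode())
    return entries


# Sample bytes reconstructed by hand from the terminal output above,
# with the invisible NUL byte put back in.
sample_manifest = b"Test01.txt\x000936f74a58571dd87ad343cc3d6ae8434ad86fc4\n"
manifest = parse_manifest(sample_manifest)
print(manifest["Test01.txt"][0])
```

To feed it real data you would capture the output of hg debugdata -m as bytes (not decoded text) so the NUL separator survives.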
You also need to go from Test02.txt to Test01.txt. You can use hg log --follow and hg debugrename for this: use hg log to get the revisions concerning the file, and use hg debugrename to see what the file was renamed from in each step.
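A single rename step could be scripted roughly as below. This is a sketch, not a tested tool: it assumes hg debugrename prints either "FILE renamed from SOURCE:NODEID" or "FILE not renamed", and that hg is on your PATH. The helper names `parse_debugrename` and `rename_source` are made up for this example, and the parser will be confused by file names that themselves contain " renamed from ".

```python
import subprocess


def parse_debugrename(line: str):
    """Return (source path, source node ID), or None if no rename is reported.

    Assumption: a rename is printed as "<file> renamed from <source>:<nodeid>".
    """
    marker = " renamed from "
    if marker not in line:
        return None
    _, source = line.rstrip("\n").split(marker, 1)
    # The node ID follows the last colon; the path itself may contain colons.
    path, node = source.rsplit(":", 1)
    return path, node


def rename_source(repo: str, rev: str, path: str):
    """Ask hg what `path` was renamed from at revision `rev` (needs hg installed)."""
    result = subprocess.run(
        ["hg", "debugrename", "-r", rev, path],
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return parse_debugrename(result.stdout)


# Parsing only; no repository is needed for this part.
step = parse_debugrename(
    "Test02.txt renamed from Test01.txt:0936f74a58571dd87ad343cc3d6ae8434ad86fc4\n"
)
print(step)
```

To follow a longer chain, you would call `rename_source` for each revision reported by hg log --follow and switch to the returned source path whenever the result is not None.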