I am writing code to analyze our commits to Hg, and am comparing my results to TortoiseHg. I am having trouble understanding the behavior of TortoiseHg in the case of a merge.
When I select a merge changeset in TortoiseHg, the list of affected files only shows those files that had conflicts, unless I press the "Show All" button. At least that appears to be the intent, based on what I can glean from the web and from the observation that the files shown in the list have a double-headed arrow if I press the Show All button.
I am attempting to emulate that by diffing each file in the changeset against both parents, and only including the file in my analysis if it differs from both parents. However, I am encountering files that TortoiseHg shows in the description of a merge, but that only differ from one parent. I see that in TortoiseHg as well: diffing against one parent (1 or 2) shows a change, but diffing against the other parent doesn't.
I have also tried diffing with the --git option, to make sure it is not a metadata change I am missing, but that doesn't change the results at all.
To get the information about a changeset I am using:
hg log -v -r <rev> --removed --style xml
I pick up the parents of the merge changeset, and for each file in the merge, do
hg diff -r <parent1> -r <rev> filename
hg diff -r <parent2> -r <rev> filename
And I find that some of the files TortoiseHg shows in its summary of the merge are files that my analysis reports as having merged with no conflicts.
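For reference, the per-file check described above can be sketched roughly as follows. This is only an illustration of the approach in Python, not the actual analysis code: the helper names hg_diff and classify are hypothetical, and it assumes hg is on the PATH and that an empty diff means the file is identical to that parent.

```python
import subprocess


def hg_diff(repo, parent, rev, filename):
    """Return the output of `hg diff --git -r parent -r rev filename`.

    Hypothetical helper: `repo` is the path to a local clone.
    """
    result = subprocess.run(
        ["hg", "-R", repo, "diff", "--git", "-r", parent, "-r", rev, filename],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def classify(diff_p1, diff_p2):
    """Classify a file from its diff text against each merge parent."""
    changed_p1 = bool(diff_p1.strip())
    changed_p2 = bool(diff_p2.strip())
    if changed_p1 and changed_p2:
        return "differs from both parents"
    if changed_p1:
        return "differs from p1 only"
    if changed_p2:
        return "differs from p2 only"
    return "matches both parents"
```

With this check, a file is only counted as conflicted when classify returns "differs from both parents", which is why a file that differs from a single parent gets reported as merged cleanly.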
Can anyone shed light on the discrepancy?
Update: I was able to reproduce this with the source code for TortoiseHg itself.
Clone from https://hg01.codeplex.com/tortoisehg, open the repo in TortoiseHg, and select rev 12602 (58eb7c70). This is a merge with parents 12599 (6c716caa) and 12601 (39c95a81).
TortoiseHg shows the file tortoisehg/hgqt/repowidget.py as the only conflicted file in the merge, yet
hg diff -r 12599 -r 12602 tortoisehg/hgqt/repowidget.py
returns nothing, while
hg diff -r 12601 -r 12602 tortoisehg/hgqt/repowidget.py
shows two lines changing.
I think I've figured out what tortoisehg's logic is here (though I haven't checked the source to be sure).
As you've guessed, TortoiseHg shows files changed on both sides of a merge with a double arrow. However, it does not simply look at the diff of the merge against each of its parents (e.g. p1(58eb7c70)::58eb7c70 and p2(58eb7c70)::58eb7c70). Instead, it finds all changes introduced in the merge, compared to the last common ancestor of the two parents.
Let's take the tortoise repo as an example. The graph view of the ancestry of 58eb7c70 is:
Jonathan:tortoisehg $ hg log --graph -r ::58eb7c70 -l 5 --template "{node|short}\n{desc|firstline}\n\n"
o 58eb7c70d501
|\ Merge with stable (noop)
| |
| o 39c95a813105
| | repowidget: show all errors on infobar
| |
| o da7ff15b4b96
| | repowidget: limit infobar error messages to 2 lines of up to 140 chars by default
| |
o | 6c716caa11fd
|\| Merge with stable
| |
| o 48c055ad634f
| | sync: show non-ascii command-line arguments correctly
| |
As you can see, merge 58eb7c70d501 merged two branches of development, with one changeset (p1, 6c716caa11fd) on one side, but two on the other (p2, 39c95a813105, and its parent, da7ff15b4b96). The point where these branches diverged is the last common ancestor of p1 and p2 -- 48c055ad634f.
(The last common ancestor can be found directly with hg log -r "last(ancestor(p1(58eb7c70), p2(58eb7c70)))".)
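If you are driving hg from a script, the same lookup could be wrapped along these lines. This is a minimal sketch in Python that reuses the revset above. The function names ancestor_revset and common_ancestor are made up for illustration, and it assumes hg is on the PATH and that repo is the path to a local clone.

```python
import subprocess


def ancestor_revset(merge_rev):
    """Build the revset for the last common ancestor of a merge's parents."""
    return f"last(ancestor(p1({merge_rev}), p2({merge_rev})))"


def common_ancestor(repo, merge_rev):
    """Return the short node hash of the common ancestor (hypothetical helper)."""
    result = subprocess.run(
        [
            "hg", "-R", repo, "log",
            "-r", ancestor_revset(merge_rev),
            "--template", "{node|short}",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Passing the revset as a single list element avoids any shell quoting problems with the parentheses.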
Let's look at the changes that were made on those two branches. We'll compare each parent of the merge with the common ancestor:
Jonathan:tortoisehg $ hg status --rev "48c055ad634f::6c716caa11fd"
M .hgtags
M tortoisehg/hgqt/commit.py
M tortoisehg/hgqt/compress.py
M tortoisehg/hgqt/hgemail.py
M tortoisehg/hgqt/postreview.py
M tortoisehg/hgqt/purge.py
M tortoisehg/hgqt/rename.py
M tortoisehg/hgqt/repowidget.py
M tortoisehg/hgqt/revset.py
M tortoisehg/hgqt/run.py
M tortoisehg/hgqt/settings.py
M tortoisehg/hgqt/status.py
M tortoisehg/hgqt/sync.py
M tortoisehg/hgqt/visdiff.py
M tortoisehg/util/cachethg.py
M tortoisehg/util/hglib.py
Jonathan:tortoisehg $ hg status --rev "48c055ad634f::39c95a813105"
M tortoisehg/hgqt/repowidget.py
These are the changes that were actually merged by 58eb7c70d501 -- everything changed on the two branches since they diverged. As you can see, the only file in common between the lists -- the only file that was changed on both branches -- is tortoisehg/hgqt/repowidget.py, just as you expected. You'll see that this file was changed in da7ff15b4b96, the one changeset that's not a parent of the merge but is still included in the changes merged from the two branches.
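To emulate this in your own analysis, you could intersect the two status lists instead of diffing the merge against each parent. The following is a rough Python sketch of that idea, based on my guess at TortoiseHg's logic rather than on its source. The function names are hypothetical, it assumes hg is on the PATH, and it passes the two revisions as separate --rev options rather than as the range used above.

```python
import subprocess


def parse_status(output):
    """Parse `hg status` output into a set of paths, dropping the status letter."""
    files = set()
    for line in output.splitlines():
        # Each line looks like "M path/to/file": one status character, a space, the path.
        if len(line) > 2 and line[0] != " " and line[1] == " ":
            files.add(line[2:])
    return files


def hg_status(repo, base, rev):
    """Return `hg status` output for the changes between base and rev."""
    result = subprocess.run(
        ["hg", "-R", repo, "status", "--rev", base, "--rev", rev],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def changed_on_both_sides(status_p1, status_p2):
    """Files touched on both branches since the common ancestor."""
    return parse_status(status_p1) & parse_status(status_p2)
```

For the example above, you would call hg_status once with the ancestor 48c055ad634f and 6c716caa11fd, once with the ancestor and 39c95a813105, and pass both outputs to changed_on_both_sides.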