
Named Branches vs Multiple Repositories


We're currently using Subversion on a relatively large codebase. Each release gets its own branch, and fixes are performed against the trunk and migrated into release branches using svnmerge.py.

I believe the time has come to move on to better source control, and I've been toying with Mercurial for a while.

There seem to be two schools of thought on managing such a release structure using Mercurial. Either each release gets its own repo, and fixes are made against the release branch and pushed to the main branch (and any other newer release branches), or releases live as named branches within a single repository (or multiple matching copies).

In either case it seems like I might be using something like transplant to cherry-pick changes for inclusion in the release branches.

I ask of you: what are the relative merits of each approach?


Solution

  • The biggest difference is how the branch names are recorded in the history. With named branches the branch name is embedded in each changeset and will thus become an immutable part of the history. With clones there will be no permanent record of where a particular changeset came from.

    This means that clones are great for quick experiments where you don't want to record a branch name, and named branches are good for long term branches ("1.x", "2.x" and similar).
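
As a rough illustration of why the branch name is permanent, here is a toy model in Python. It is not how Mercurial computes hashes (the real changeset hash also covers the manifest, user, date, and more); the function `changeset_id` and its inputs are made up for this sketch. The only point is that the branch name is an input to a changeset's identity, while the location of the clone is not.

```python
import hashlib


def changeset_id(parents, description, branch="default"):
    """Toy changeset ID: a hash over parents, branch name, and description."""
    payload = "\n".join([",".join(sorted(parents)), branch, description])
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()[:12]


root = changeset_id([], "initial import")

# The same change recorded on two different named branches.
fix_on_default = changeset_id([root], "fix crash")
fix_on_release = changeset_id([root], "fix crash", branch="1.x")
print(fix_on_default != fix_on_release)  # True: the branch name is baked in

# Nothing about the clone enters the hash, so in this toy model the same
# change committed in two clones is indistinguishable afterwards.
in_clone_a = changeset_id([root], "fix crash")
in_clone_b = changeset_id([root], "fix crash")
print(in_clone_a == in_clone_b)  # True: no record of where it came from
```

Renaming a named branch after the fact would therefore mean rewriting every changeset on it, which is why named branches suit long-lived lines like "1.x".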

    Note also that a single repository can easily accommodate multiple light-weight branches in Mercurial. Such in-repository branches can be bookmarked so that you can easily find them again. Let's say that you have cloned the company repository when it looked like this:

    [a] --- [b]
    

    You hack away and make [x] and [y]:

    [a] --- [b] --- [x] --- [y]
    

    Meanwhile someone puts [c] and [d] into the repository, so when you pull you get a history graph like this:

                [x] --- [y]
               /
    [a] --- [b] --- [c] --- [d]
    

    Here there are two heads in a single repository. Your working copy will always reflect a single changeset, the so-called working copy parent changeset. Check this with:

    % hg parents
    

    Let's say that it reports [y]. You can see the heads with

    % hg heads
    

    and this will report [y] and [d]. If you want to update your working copy to a clean checkout of [d], then simply do (substituting the revision number of [d] for [d]):

    % hg update --clean [d]
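
To make "heads" and "working copy parent" concrete, here is a small Python sketch of the graph above. It is a toy model, not Mercurial code: the `parents` dictionary and the `heads` function are names invented for this illustration.

```python
# Toy history graph: each changeset maps to the list of its parents.
parents = {
    "a": [],
    "b": ["a"],
    "x": ["b"],
    "y": ["x"],
    "c": ["b"],
    "d": ["c"],
}


def heads(parents):
    """Return the changesets that no other changeset names as a parent."""
    has_child = {p for ps in parents.values() for p in ps}
    return sorted(cs for cs in parents if cs not in has_child)


print(heads(parents))  # ['d', 'y']

# The working copy parent is only a pointer into the graph. Updating moves
# the pointer; it does not add or remove changesets.
working_copy_parent = "y"
working_copy_parent = "d"  # roughly what updating to [d] amounts to here
print(heads(parents))  # ['d', 'y']
```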
    

    You will then see that hg parents reports [d]. This means that your next commit will have [d] as parent. You can thus fix a bug you've noticed in the main branch and create changeset [e]:

                [x] --- [y]
               /
    [a] --- [b] --- [c] --- [d] --- [e]
    

    To push changeset [e] only, you need to do

    % hg push -r [e]
    

    where [e] is the changeset hash. Without -r, hg push simply compares the repositories, sees that [x], [y], and [e] are missing from the target, and sends all three, but you might not want to share [x] and [y] yet.
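
The difference between the two kinds of push can be sketched in a few lines of Python. This is a simplified model under the assumption that a push sends the requested changeset plus any of its ancestors the other side lacks; the helper names `ancestors` and `outgoing` are invented for the example.

```python
def ancestors(parents, rev):
    """Return rev together with all of its ancestors."""
    seen, stack = set(), [rev]
    while stack:
        cs = stack.pop()
        if cs not in seen:
            seen.add(cs)
            stack.extend(parents[cs])
    return seen


def outgoing(parents, remote, rev=None):
    """Changesets a push would send: everything, or only rev's ancestry."""
    wanted = set(parents) if rev is None else ancestors(parents, rev)
    return sorted(wanted - remote)


local = {
    "a": [], "b": ["a"],
    "x": ["b"], "y": ["x"],
    "c": ["b"], "d": ["c"], "e": ["d"],
}
remote = {"a", "b", "c", "d"}  # what the company repository already has

print(outgoing(local, remote))       # ['e', 'x', 'y']
print(outgoing(local, remote, "e"))  # ['e']
```

Because [x] and [y] are not ancestors of [e], limiting the push to [e] leaves them behind.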

    If the bugfix also affects you, you want to merge it with your feature branch:

    % hg update [y]
    % hg merge
    

    That will leave your repository graph looking like this:

                [x] --- [y] ----------- [z]
               /                       /
    [a] --- [b] --- [c] --- [d] --- [e]
    

    where [z] is the merge between [y] and [e]. You could also have opted to throw the branch away (hg strip is provided by an extension, so it has to be enabled first):

    % hg strip [x]
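
Both outcomes can be shown on a toy graph in Python. This is only a model of the effect on the history: `merge` here is just a new changeset with two parents, and `strip` is a made-up helper that drops a changeset together with everything descended from it.

```python
def heads(parents):
    """Return the changesets that no other changeset names as a parent."""
    has_child = {p for ps in parents.values() for p in ps}
    return sorted(cs for cs in parents if cs not in has_child)


def strip(parents, rev):
    """Return a copy of the graph without rev and all of its descendants."""
    doomed = {rev}
    changed = True
    while changed:
        changed = False
        for cs, ps in parents.items():
            if cs not in doomed and any(p in doomed for p in ps):
                doomed.add(cs)
                changed = True
    return {cs: ps for cs, ps in parents.items() if cs not in doomed}


graph = {
    "a": [], "b": ["a"],
    "x": ["b"], "y": ["x"],
    "c": ["b"], "d": ["c"], "e": ["d"],
}

# Option 1: merge. The merge changeset [z] has both heads as parents.
merged = dict(graph, z=["y", "e"])
print(heads(merged))  # ['z']

# Option 2: throw the feature branch away, starting at [x].
stripped = strip(graph, "x")
print(sorted(stripped))  # ['a', 'b', 'c', 'd', 'e']
print(heads(stripped))   # ['e']
```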
    

    The main point of this story is this: a single clone can easily represent several tracks of development. This has always been true for "plain hg" without using any extensions. The bookmarks extension is a great help, though. It will allow you to assign names (bookmarks) to changesets. In the case above you will want a bookmark on your development head and one on the upstream head. Bookmarks can be pushed and pulled since Mercurial 1.6 and became a built-in feature in Mercurial 1.8.
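
Conceptually a bookmark is a movable name for a changeset, and the active bookmark follows your commits. The Python sketch below models just that idea; the `Repo` class and its methods are hypothetical and only loosely mirror what hg bookmark, hg update, and hg commit do. For simplicity [c] and [d] are committed locally here rather than pulled.

```python
class Repo:
    """Toy repository: a parent map plus bookmarks that follow commits."""

    def __init__(self):
        self.parents = {}
        self.bookmarks = {}
        self.active = None
        self.working_parent = None

    def commit(self, name):
        self.parents[name] = [self.working_parent] if self.working_parent else []
        self.working_parent = name
        if self.active is not None:
            # The active bookmark advances to the new changeset.
            self.bookmarks[self.active] = name

    def bookmark(self, label):
        # A new bookmark points at the working copy parent and becomes active.
        self.bookmarks[label] = self.working_parent
        self.active = label

    def update(self, label):
        self.working_parent = self.bookmarks[label]
        self.active = label


repo = Repo()
repo.commit("a")
repo.commit("b")
repo.bookmark("upstream")
repo.bookmark("feature")
repo.commit("x")
repo.commit("y")

repo.update("upstream")  # jump back to [b] by name
for name in ("c", "d", "e"):
    repo.commit(name)

print(repo.bookmarks)  # {'upstream': 'e', 'feature': 'y'}
```

With the two names in place you no longer have to look up revision numbers to move between the heads.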

    If you had opted to make two clones, your development clone would have looked like this after making [x] and [y]:

    [a] --- [b] --- [x] --- [y]
    

    And your upstream clone will contain:

    [a] --- [b] --- [c] --- [d]
    

    You now notice the bug and fix it. Here you don't have to hg update since the upstream clone is ready to use. You commit and create [e]:

    [a] --- [b] --- [c] --- [d] --- [e]
    

    To include the bugfix in your development clone you pull it in there:

    [a] --- [b] --- [x] --- [y]
               \
                [c] --- [d] --- [e]
    

    and merge:

    [a] --- [b] --- [x] --- [y] --- [z]
               \                   /
                [c] --- [d] --- [e]
    

    The graph might look different, but it has the same structure and the end result is the same. Using the clones you had to do a little less mental bookkeeping.
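
That the two workflows end in the same place can be checked on a toy model. In the Python sketch below each clone is a dictionary mapping a changeset to its parents, and `pull` is a simplified stand-in (an assumption of this example) that copies over whatever the destination lacks.

```python
def pull(dest, source):
    """Copy every changeset from source that dest does not already have."""
    for cs, ps in source.items():
        dest.setdefault(cs, list(ps))


base = {"a": [], "b": ["a"]}

# Two clones of the same starting point.
dev = dict(base, x=["b"], y=["x"])
upstream = dict(base, c=["b"], d=["c"])

upstream["e"] = ["d"]  # the bugfix, committed in the upstream clone
pull(dev, upstream)    # bring [c], [d], [e] into the development clone
dev["z"] = ["y", "e"]  # merge the two heads

# The graph produced by the single-repository workflow.
single = {
    "a": [], "b": ["a"],
    "x": ["b"], "y": ["x"],
    "c": ["b"], "d": ["c"], "e": ["d"],
    "z": ["y", "e"],
}

print(dev == single)  # True
```

The drawing order differs, but every changeset has the same parents in both cases.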

    Named branches didn't really come into the picture here because they are quite optional. Mercurial itself was developed using two clones for years before we switched to using named branches. We maintain a branch called 'stable' in addition to the 'default' branch and make our releases based on the 'stable' branch. See the standard branching page in the wiki for a description of the recommended workflow.