Tags: git, version-control, mercurial, dvcs, bazaar

Is there any distributed revision control system that supports partial checkout/clone?


As far as I know, all distributed revision control systems require you to clone the whole repository. For this reason it is not wise to put huge amounts of content into a single repository (thanks for this answer). I know that this is not a bug but a feature, but I wonder whether it is a requirement for all distributed revision control systems.

In a distributed RCS, the history of a file (or a chunk of content) is a directed acyclic graph (DAG), so why can't you just clone this single DAG instead of the set of all graphs in the repository? Maybe I am missing something, but the following use cases are hard to do:

  • clone only a part of a repository
  • merge two repositories (preserving their histories!)
  • copy some files with their history from one repository to another

If I reuse parts of other people's code from multiple projects I cannot preserve their full history. At least in git I can think of a (rather complex) workaround:

  1. clone a full repository
  2. delete all content that I am not interested in
  3. rewrite the history to delete everything that is not in the master
  4. merge the remaining repository into an existing repository
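
Steps 2 and 3 of this workaround amount to filtering the commit graph by path. The following is only a toy model of that idea in Python, not what git actually does internally: the `Commit` class, the `filter_history` function, and the sample history are all hypothetical names made up for illustration. It keeps the commits that touch a given path prefix and reconnects each kept commit to its nearest kept ancestors. It does not prune redundant parents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    node: str          # identifier of this commit (toy value, not a real hash)
    parents: tuple     # identifiers of the parent commits
    files: frozenset   # paths touched by this commit


def filter_history(commits, prefix):
    """Keep only commits touching paths under `prefix` and rewire parents.

    `commits` must be in topological order (parents before children).
    """
    kept = {}      # node -> filtered Commit
    stand_in = {}  # original node -> kept nodes that replace it as a parent
    for c in commits:
        # Map every original parent onto the kept commits standing in for it.
        new_parents = []
        for p in c.parents:
            for q in stand_in[p]:
                if q not in new_parents:
                    new_parents.append(q)
        touched = frozenset(f for f in c.files if f.startswith(prefix))
        if touched:
            kept[c.node] = Commit(c.node, tuple(new_parents), touched)
            stand_in[c.node] = (c.node,)
        else:
            # Dropped commit: its children inherit its (rewired) parents.
            stand_in[c.node] = tuple(new_parents)
    return kept


# Hypothetical history: "e" merges "c" and "d"; only some commits touch lib/.
history = [
    Commit("a", (), frozenset({"lib/x.py", "docs/readme"})),
    Commit("b", ("a",), frozenset({"docs/readme"})),
    Commit("c", ("b",), frozenset({"lib/x.py"})),
    Commit("d", ("a",), frozenset({"lib/y.py"})),
    Commit("e", ("c", "d"), frozenset({"docs/changelog"})),
    Commit("f", ("e",), frozenset({"lib/x.py"})),
]

kept = filter_history(history, "lib/")
for node, commit in kept.items():
    print(node, commit.parents, sorted(commit.files))
```

In this sample, commits `b` and `e` disappear because they touch nothing under `lib/`, and `f` ends up with `c` and `d` as parents, so the filtered history is still a DAG with the merge preserved.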

I don't know whether this is also possible with Mercurial or Bazaar, but in any case it is not easy. So is there any distributed RCS that supports partial checkout/clone by design? It should support one simple command to get a single file with its history from one repository and merge it into another. This way you would not need to think about how to structure your content into repositories and submodules; you could happily split and merge repositories as needed (the extreme would be one repository for each single file).


Solution

  • As of version 2.0, it is not possible to make a so-called "narrow clone" with Mercurial, that is, a clone where you only retrieve data for a specific sub-directory. By contrast, we call it a "shallow clone" when you only retrieve part of the history, say, the last 100 revisions.

    As you say, there is nothing in the common DAG-based history model that excludes this feature and we have been working on it. Peter Arrenbrecht, a Mercurial contributor, has implemented two different approaches for narrow clones, but neither approach has been merged yet.

    Btw, you can of course split an existing Mercurial repository into pieces where each smaller repository only has the history for a specific sub-directory of the original repository. The convert extension is the tool for this. Each of the smaller repositories will be unrelated to the bigger repository, though: the tricky part is to make the splitting seamless so that the changesets keep their identities.
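
To illustrate why the split repositories end up unrelated, here is a rough Python sketch. It assumes changeset identifiers are derived from a hash over the parent identifiers and the content, which holds in broad terms for Mercurial and git, but the exact format below is invented for illustration and is not Mercurial's real one. The names `changeset_id` and `build` and the sample history are hypothetical.

```python
import hashlib


def changeset_id(parent_ids, files):
    """Toy content-addressed ID: SHA-1 over sorted parent IDs and file data."""
    h = hashlib.sha1()
    for p in sorted(parent_ids):
        h.update(p.encode())
    for path in sorted(files):
        h.update(path.encode() + b"\0" + files[path] + b"\0")
    return h.hexdigest()


def build(history, keep=lambda path: True):
    """Return the IDs of a linear history, restricted to paths where keep() is true."""
    ids = []
    parents = []
    for files in history:
        visible = {p: data for p, data in files.items() if keep(p)}
        cid = changeset_id(parents, visible)
        ids.append(cid)
        parents = [cid]  # the next changeset has this one as its only parent
    return ids


# Hypothetical two-changeset history; each entry is a full snapshot.
history = [
    {"lib/x.py": b"v1", "docs/readme": b"hello"},
    {"lib/x.py": b"v2", "docs/readme": b"hello"},
]

full_ids = build(history)
split_ids = build(history, keep=lambda p: p.startswith("lib/"))

for full, split in zip(full_ids, split_ids):
    print(full[:12], "->", split[:12])
```

Under this model, dropping `docs/` changes the first identifier, and because every later identifier covers its parent's, the change propagates through the whole history. That is why a naive split yields changesets the original repository does not recognize.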