
How can I import SVN with inconsistent branching structure into Git?


I apologize if this question has been asked before, but my case is fairly specific. I have been using SVN at my company for some time, and I recently decided to move to Git for various reasons.

The issue I'm having right now is that my company uses a non-standard branching structure, and unfortunately, at times in the past, it hasn't even been a consistent non-standard branching structure.

The history I'm aware of, since joining the company, is that we use one main trunk branch, from which we create release branches and feature branches. These branches don't follow the standard trunk/branches/tags layout, however. We have several subfolders for different types of branches: release branches go in branches_release, feature branches in branches_feature, and so on, like the following:

branches_feature/featureA
branches_release/2.0

I figured out how to make this clone/fetch work properly by modifying the Git repo's config so that the branches line reads:

branches = {branches_feature,branches_release}/*:refs/remotes/branches/*

This has been relatively successful in fetching the appropriate branches. The one issue I'm having is that when my company first started, it used a structure more like:

branches_feature/username/branchname

Unfortunately, I found this out the hard way: after running "git svn fetch", all of the branches that follow the older convention had been collapsed, so that in Git each user has a single branch containing every branch that user ever made. So,

branches_feature/username/featureA
branches_feature/username/featureB

have been collapsed into:

branches_feature/username

Obviously this is insufficient for a properly reproduced SVN repo history, but I'm not sure how to modify the config's branches line to encapsulate all of these branches AND still use the new branching format properly. I've been trying to manipulate it in various ways, but I wind up either getting errors or simply being unsuccessful in my attempts.

If anyone can suggest a good way to appropriately preserve the SVN repo's history while importing from SVN to Git, I would greatly appreciate it.

Thanks.


Solution

  • TL;DR: For anything except the most trivial of repositories, you'll never be able to fully preserve the contents of a Subversion repository in a git svn repository.

    I'm adapting this from my answer to a similar question.

    By my understanding, your Subversion tree looks something like this, where * indicates a folder that at some point in the Subversion history would have been the root of a working copy:

    /
    |--branches_feature/
    |  |--featureA/       *
    |  |--userB/          *
    |  |  |--featureB/    * (Possibly now deleted, but existed previously)
    |  |  `--featureC/    *
    |  `--userC/          *
    |--branches_release/
    |  |--V1.0/           *
    |  `--V2.0/           *
    `--trunk/             *
    

    Sadly, git svn can't cope with a repository like that in a particularly sensible way. You're not going to get a Git repository that has all the branches of your Subversion repository and none that it shouldn't have.

    Your options are thus:

    • Treat both branches_feature and branches_feature/userB as branch folders.

      You'll end up with some Git branches (userB, for instance) that, were you to check them out, would contain one folder per Subversion branch rather than the contents of a single branch. git svn fetch may also take longer, as it needs to fetch both the container branch and the real branches inside it. Because Git stores identical content only once, this will at least take up very little extra disk space.

      I'd expect your .git/config to have lines like the below:

      branches = branches_feature/*:refs/remotes/branches/*
      branches = branches_feature/userB/*:refs/remotes/branches/*
      branches = branches_release/*:refs/remotes/branches/*
      
    • Ignore some branch folders. Just don't tell git svn about them, and continue in merry ignorance.

    • Pick out the branches you're interested in and map each one manually. If you want the userB folder itself as a branch, you'll still need to be careful about the history you pick up: it may include sub-branches that have since been deleted, which you may not want.

      Here, I'd expect your .git/config to have a whole load of lines like the below:

      fetch = branches_feature/featureA:refs/remotes/branches/featureA
      fetch = branches_feature/userB/featureB:refs/remotes/branches/featureB
      fetch = branches_feature/userC:refs/remotes/branches/userC
      
    • Patch your version of git svn to somehow allow it to cope with this scenario. Bonus points if you get it included in future official Git releases.
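
For the manual-mapping option above, the fetch lines can be generated rather than typed by hand. The following is a minimal shell sketch, not part of git svn: make_fetch_lines is a hypothetical helper, and it assumes each Git branch should be named after the last component of its Subversion path, as in the fetch lines shown earlier.

```shell
# Hypothetical helper (not a git svn command): print one git svn "fetch"
# line per Subversion branch path given as an argument.
# Assumption: the Git ref is named after the last path component.
make_fetch_lines() {
  for svn_path in "$@"; do
    name=${svn_path##*/}   # e.g. branches_feature/userB/featureB -> featureB
    printf 'fetch = %s:refs/remotes/branches/%s\n' "$svn_path" "$name"
  done
}

# Paths taken from the example tree above; substitute your own list.
make_fetch_lines \
  branches_feature/featureA \
  branches_feature/userB/featureB \
  branches_feature/userC
```

The output would be pasted under the [svn-remote "svn"] section of .git/config, which is the section git svn init creates by default. If two users have branches with the same name, this naming scheme would map both to one ref, so those entries would need distinct names chosen by hand.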