After trying several options and a bunch of hints from this site and others, I'm stuck. My main question is the following: I'd like to migrate (part of) an SVN repository to Git, preserving history. The SVN layout is non-standard, and after git svn clone I do see the right branches appear, but when I try to e.g. merge master into a branch, I get conflicts that say both added a set of files. If I take a look in e.g. gitg, I see the branches, but they never seem to branch from master/trunk (so the "both added" conflicts seem logical from that perspective), nor do I see any of the merges (e.g. from trunk to a branch) in the graph (the commits are there, they just don't link to branches in the graphical display of gitg). In fact, for some branches I even see two identical commits one after the other (one for master, one for the branch).
The way I created the branches in SVN was using svn copy.
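The suspicion that the branches never fork from master can be checked without a graphical tool: git merge-base prints the common ancestor of two refs and exits with a non-zero status when there is none. Below is a minimal sketch on a throwaway repository. The branch name feature and the file names are made up for the demonstration, and the orphan branch only imitates what the converted branches looked like.

```shell
#!/bin/sh
set -e

# Scratch repository; nothing here touches the real migration.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "trunk commit"

# An orphan branch shares no commit with master, which imitates
# branches that were imported without a link to trunk.
git checkout -q --orphan feature
echo one > file.txt
git add file.txt
git commit -q -m "branch commit adding the same file"

# merge-base fails when the two histories are unrelated.
if git merge-base master feature >/dev/null; then
    shared_history=yes
else
    shared_history=no
fi
echo "shared history: $shared_history"
```

If the same check reports no shared history between master and a converted branch, the "both added" conflicts are expected, because Git sees two independent histories that happen to contain the same paths.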
Some more details:
Repository layout: A slightly simplified schematic of the SVN repo layout (the structure is the same, names are different, some directories have been omitted):
pkg
    Project1
    Project2
    Lib
branches
    Project1-feature1
        Project1
        Lib
    Project1-hotfix
        Project1
        Lib
    Lib-feature
tags
    Project1
        v0.1.0
        v0.2.0
            Project1
            Lib
    Project2
        v0.1.0
The Lib directory is closely associated with Project1, but also used by others. That is why (starting with v0.2.0) I created the Project1 and Lib subdirectory structure in the branches and tags.
My git svn workflow: This is the most promising command I used to clone the SVN repo:
git svn clone \
--prefix=svn/ \
--trunk=pkg \
--branches=branches \
--tags=tags/Project1 \
-A authors.txt \
--ignore-paths='^pkg/(?!Project1|Lib)' \
svn+ssh://user@svn.r-forge.r-project.org/svnroot/MyTool SVN2GitMigration
The --ignore-paths option is there so that I keep only the two directories (Project1 and Lib) in which I'm interested. I do not filter on branches since there is only one branch not directly related to Project1.
After that I convert the remote branches to local branches (and remove the remote branches), then convert the tags to proper Git tags.
EDIT START: Closer inspection of the commits reveals that I have many empty commits. These turn out to be due to the --ignore-paths option: the empty commits correspond to SVN commits made in parts of the directory tree that are ignored. So this option doesn't really behave as I expected.
Back to the drawing board...
EDIT END
EDIT2
Actually, using git filter-branch --tag-name-filter cat --prune-empty -- --all I managed to remove the empty commits.
EDIT2 END
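To see how many empty commits there are before pruning them, each commit's tree can be compared with its parent's tree. The following is a rough sketch, not part of the original workflow: list_empty_commits is a hypothetical helper name, and the scratch repository at the end exists only to show the helper working.

```shell
#!/bin/sh
set -e

# Print every non-merge commit, reachable from any ref, whose tree is
# identical to the tree of its parent (that is, an empty commit).
list_empty_commits() {
    git rev-list --all --no-merges --parents | while read -r commit parent; do
        # Root commits have no parent to compare against; skip them.
        [ -n "$parent" ] || continue
        if [ "$(git rev-parse "$commit^{tree}")" = "$(git rev-parse "$parent^{tree}")" ]; then
            echo "$commit"
        fi
    done
}

# Demonstration on a scratch repository.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

echo data > file.txt
git add file.txt
git commit -q -m "real change"
git commit -q --allow-empty -m "empty commit, like one that only touched ignored paths"

empty_count=$(list_empty_commits | wc -l | tr -d ' ')
echo "empty commits: $empty_count"
```

Running list_empty_commits inside the converted repository before and after the filter-branch call gives a quick confirmation that the pruning did what was intended.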
Possible cause of my merge problems: Branches/Tags are not single SVN commits. Each one consists of a commit in which I create the branches/Project1-featureX directory, followed by two svn copy operations in which I copy the Project1 and Lib directories from trunk.
Suggestions on how to properly convert this SVN repo are very welcome! If this somehow means losing Lib, that isn't a big deal. I'm planning to separate the two anyway once the migration has finished.
After a lot of trial and error I solved my problem in the following way:
First I initialised a repository without any branches or tags:
git svn init \
--prefix=svn/ \
--trunk=pkg/Project1 \
svn+ssh://user@svn.r-forge.r-project.org/svnroot/MyTool \
SVN2GitMigration
Next I added the author information:
cd SVN2GitMigration
git config svn.authorsfile ../authors.txt
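For reference, git svn expects one line per SVN login in the authors file, in the form login = Full Name <email>. Here is a minimal sketch with made-up entries (jdoe and asmith and their addresses are placeholders, not the real committers), followed by a quick check for lines that do not follow that form:

```shell
#!/bin/sh
set -e

# Work in a scratch directory so no real authors.txt is overwritten.
cd "$(mktemp -d)"

# Placeholder entries; replace them with the real SVN logins.
cat > authors.txt <<'EOF'
jdoe = Jane Doe <jane.doe@example.com>
asmith = Alex Smith <alex.smith@example.com>
EOF

# Count lines that do not look like "login = Name <email>".
# grep exits non-zero when the count is 0, hence the "|| true".
malformed=$(grep -cvE '^[^ =]+ = .+ <[^<>]+>$' authors.txt || true)
echo "malformed lines: $malformed"
```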
After this, my .git/config file had the following contents:
[core]
    repositoryformatversion = 0
    filemode = true
    bare = false
    logallrefupdates = true
[svn-remote "svn"]
    url = svn+ssh://user@svn.r-forge.r-project.org/svnroot/MyTool
    fetch = pkg/Project1:refs/remotes/svn/trunk
[svn]
    authorsfile = ../authors.txt
In order to get the branches and tags I changed that file to:
[core]
    repositoryformatversion = 0
    filemode = true
    bare = false
    logallrefupdates = true
[svn-remote "svn"]
    url = svn+ssh://user@svn.r-forge.r-project.org/svnroot/MyTool
    fetch = pkg/Project1:refs/remotes/svn/trunk
    tags = tags/Project1/{v0.4.2,v0.4.1,v0.4.0,v0.3.0,v0.2.2,v0.2.0}/Project1:refs/remotes/svn/tags/*
    tags = tags/Project1/{v0.2.1,v0.1-9e,v0.1.3}:refs/remotes/svn/tags/*
    branches = branches/{Project1-v0.4.2-fixes,Project1-v0.4.1-fixes,Project1-refactor,Project1-feature1}/Project1:refs/remotes/svn/*
    branches = branches/{Project1-feature2}:refs/remotes/svn/*
[svn]
    authorsfile = ../authors.txt
Notice how each branches and tags line has a list of directory names in {}, even if it only contains one directory name. Without this, the fetching won't work.
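Instead of editing .git/config by hand, the same branches and tags lines can be added with git config --add. This is only a sketch: it runs in a scratch repository, shows that the values are stored verbatim, and never contacts the SVN server. The single quotes matter, because some shells would otherwise expand the braces and the asterisk themselves.

```shell
#!/bin/sh
set -e

# Scratch repository, so the real migration config stays untouched.
demo=$(mktemp -d)
cd "$demo"
git init -q .

# --add appends a value instead of replacing an existing one, which is
# what allows several "tags" and "branches" lines in the same section.
git config --add svn-remote.svn.tags \
    'tags/Project1/{v0.4.2,v0.4.1,v0.4.0,v0.3.0,v0.2.2,v0.2.0}/Project1:refs/remotes/svn/tags/*'
git config --add svn-remote.svn.tags \
    'tags/Project1/{v0.2.1,v0.1-9e,v0.1.3}:refs/remotes/svn/tags/*'
git config --add svn-remote.svn.branches \
    'branches/{Project1-v0.4.2-fixes,Project1-v0.4.1-fixes,Project1-refactor,Project1-feature1}/Project1:refs/remotes/svn/*'
git config --add svn-remote.svn.branches \
    'branches/{Project1-feature2}:refs/remotes/svn/*'

# Read the values back to confirm both lines of each kind were stored.
tag_specs=$(git config --get-all svn-remote.svn.tags | wc -l | tr -d ' ')
branch_specs=$(git config --get-all svn-remote.svn.branches | wc -l | tr -d ' ')
echo "tag specs: $tag_specs, branch specs: $branch_specs"
```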
To download and convert the SVN repository run:
git svn fetch
After this, some post-processing is required. To convert the remote tags and branches to proper local tags and branches and delete the remote ones, run:
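# Create a local branch for every remote branch (skipping tags and trunk).
for branch in $(git branch -r | grep -v tags | grep -v trunk | sed 's/svn\///'); do
    git branch "$branch" "remotes/svn/$branch"
done
# Create a Git tag for every remote branch under svn/tags/.
for tag in $(git branch -r | grep tags | sed 's;svn/tags/;;'); do
    git tag "$tag" "remotes/svn/tags/$tag"
done
# Delete the remote-tracking branches, which are no longer needed.
for br in $(git branch -r); do
    git branch -d -r "$br"
done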
Convert the svn:ignore properties to a .gitignore file:
git svn show-ignore > .gitignore
git add .gitignore
git commit -m "Added .gitignore file based on the svn:ignore properties"
After inspecting the git repo with gitg or gitk it turned out that many merges were missing (not shown in the graph), so I had to graft those by hand by adding the parent commit hashes to the .git/info/grafts file (the file format is merge_hash parent1_hash parent2_hash, one merge per line). Note that gitk shows the grafts, whereas gitg doesn't until they are made permanent.
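Newer Git versions discourage the grafts file in favour of git replace --graft, which takes the same information (the merge commit followed by its intended parents) on the command line. The sketch below swaps in that command rather than the grafts file used above, and it runs on a scratch repository. The branch name feature and all file names are invented, and the single-parent "merge" commit only imitates what the import produced.

```shell
#!/bin/sh
set -e

demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "base"

git checkout -q -b feature
echo feature > feature.txt
git add feature.txt
git commit -q -m "work on feature"
feature_tip=$(git rev-parse HEAD)

# Imitate the import result: the merged content lands on master as an
# ordinary commit with a single parent.
git checkout -q master
echo feature > feature.txt
git add feature.txt
git commit -q -m "merge feature (second parent not recorded)"
merge_commit=$(git rev-parse HEAD)
first_parent=$(git rev-parse HEAD^)

# Same data as a grafts line: merge_hash parent1_hash parent2_hash
git replace --graft "$merge_commit" "$first_parent" "$feature_tip"

# rev-list --parents prints the commit followed by its parents.
words=$(git rev-list --parents -n 1 "$merge_commit" | wc -w | tr -d ' ')
parent_count=$((words - 1))
echo "parents after graft: $parent_count"
```

As with a graft, the replacement is only a local overlay on the history until the history is rewritten.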
To make the grafts permanent, use
git filter-branch --tag-name-filter cat -- --all
and to remove the backups created by git filter-branch run:
git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
Now that everything is converted, clone the repository into a bare one:
git clone --bare SVN2GitMigration Project1.git
and push that to GitHub:
cd Project1.git
git push --mirror https://github.com/mygithubuser/Project1.git
Thanks to the following sites for pointing me in the right direction: