Tags: git, svn, version-control, git-svn

Importing multi-project SVN repository does not import all files created with svn copy


We're trying to move a legacy svn repository containing multiple projects into multiple git repositories.

Our repository setup is similar to the one described at https://daneomatic.com/2010/11/01/svn-to-multiple-git-repos/ . Following the suggestions in that post, outdated as it is, does lead to properly split-up repositories.

The actual process of splitting up the projects into different repositories is okay, but upon performing the svn repository import into git, several files and folders were not present after the import completed.

I tracked it down to files and folders that, according to svn log, had been copied from other locations in the SVN tree with svn copy. No matter which flags I experiment with when running git svn clone (such as specifying trunk, using filter-branch and similar), git-svn doesn't seem able to resolve the svn cp, so those files and folders are missing once the import has completed.

My theory is that the git svn import fails to "resolve" the svn copy. Even when doing a full import of the entire repository, the files in question are still missing, and comparing to a regular svn checkout I can pinpoint several missing files and folders.

Does anyone have any experience with this, flags I can use on git svn, or tool suggestions to complete an import like this? I'd like to complete the import keeping the history somewhat intact.


Solution

  • git-svn is not the right tool for one-time conversions of repositories or parts of repositories. It is a great tool if you want to use Git as a frontend for an existing SVN server, but for a one-time migration you should use svn2git instead, which is much better suited to this use case.

    There are plenty of tools called svn2git; probably the best one is the KDE one from https://github.com/svn-all-fast-export/svn2git. I strongly recommend using that svn2git tool. It is the best I know of, and it is very flexible in what you can do with its rules files.

    You will easily be able to configure svn2git's rules file to produce the result you want from your current SVN layout, including any complex histories like yours. It can also produce several Git repos out of one SVN repo, or cleanly combine different SVN repos into one Git repo, in a single run if you like.
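
To give a rough idea of what such a rules file looks like, here is a minimal sketch that writes one and optionally starts the export. Everything specific in it is an assumption: the project names projectA and projectB, the /<project>/{trunk,branches,tags} layout, the file name svn2git.rules, and the RULES_FILE and SVN_REPO variables are placeholders to adapt, not details from the question. As far as I know the rules are tried in order and the first match wins, which is why the catch-all comes last.

```shell
#!/bin/sh
# Sketch only: project names and SVN paths below are hypothetical placeholders.
RULES_FILE=${RULES_FILE:-svn2git.rules}

# Quoted heredoc delimiter so the shell leaves \1 and the regexes untouched.
cat > "$RULES_FILE" <<'EOF'
# One Git repository per SVN project.
# Assumed SVN layout: /<project>/{trunk,branches,tags}
create repository projectA
end repository

create repository projectB
end repository

match /projectA/trunk/
  repository projectA
  branch master
end match

match /projectA/branches/([^/]+)/
  repository projectA
  branch \1
end match

match /projectA/tags/([^/]+)/
  repository projectA
  branch refs/tags/\1
end match

match /projectB/trunk/
  repository projectB
  branch master
end match

# Catch-all: a match without a repository is meant to ignore every other path.
match /
end match
EOF

echo "Wrote rules to $RULES_FILE"

# The export needs filesystem access to the SVN repository itself (not a
# working copy). It only runs when SVN_REPO is set; check the tool's --help
# output for the options your build supports.
if [ -n "${SVN_REPO:-}" ]; then
    svn-all-fast-export --rules "$RULES_FILE" "$SVN_REPO"
fi
```

Running it with SVN_REPO unset only writes the rules file, so the rules can be reviewed before any conversion is attempted.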

    If you are not 100% sure about the history of your repository, svneverever from http://blog.hartwork.org/?p=763 is a great tool for investigating the history of an SVN repository when migrating it to Git.


    Even though git-svn is easier to start with, here are some further reasons why the KDE svn2git is superior to git-svn, besides its flexibility:

    • the history is rebuilt much better and more cleanly by svn2git (if the correct one is used); this is especially true for more complex histories with branches, merges and so on
    • the tags are real tags and not branches in Git
    • with git-svn the tags contain an extra empty commit, which also means they are not part of the branches. A normal fetch will therefore not get them until you pass --tags to the command, because by default only tags pointing to fetched branches are fetched. With the proper svn2git, tags are where they belong
    • if you changed the layout in SVN, you can easily configure this with svn2git; with git-svn you will eventually lose history
    • with svn2git you can also split one SVN repository into multiple Git repositories easily
    • or combine multiple SVN repositories in the same SVN root into one Git repository easily
    • the conversion is a gazillion times faster with the correct svn2git than with git-svn

    You see, there are many reasons why git-svn is worse and the KDE svn2git is superior. :-)