
Git svn clone: Is it possible to resume after error Malformed XML: no element found?


I'm attempting a one-way migration of a large Subversion repository to Git using git svn with the following command (IMPORTANT: after the migration only Git will be used):

git svn clone --no-minimize-url --trunk=/trunk/GBI --branches=/branches/GBI --tags=/tags/GBI --authors-file=authors.txt https://yyy/svn-repos/zzz/ GBI

After running for a couple of hours, the clone process crashes with the following error:

r79791 = 00349b8063f90447ea8a040751cd2a40e74b74f3 (refs/remotes/origin/trunk)
Error from SVN, (175009): Malformed network data: The XML response contains invalid XML: Malformed XML: no element found

Then I thought that maybe there is a clever way to resume the process right after the offending revision. Is that possible?

Any ideas what causes this error in the first place?

The answer to this question suggests using the --log-window-size option to prevent this issue from happening in the first place. Can I add the option and retry from the failed revision? Is this then a git svn memory usage issue, or a problem caused solely by a corrupted Subversion revision?

Is there a git svn option that makes the process more robust, so that it ignores errors instead of aborting the lengthy run?

UPDATE: I arrived at this point by following the Atlassian Stash "Migrating to Git" guide, which recommends using git svn together with their svn-migration-scripts.jar


Solution

  • git-svn is not the right tool for one-time conversions of repositories or repository parts. It is a great tool if you want to use Git as a frontend for an existing SVN server, but for one-time conversions you should use svn2git instead, which is much better suited to this use case.

    There are plenty of tools called svn2git; probably the best one is the KDE one from https://github.com/svn-all-fast-export/svn2git. I strongly recommend using that svn2git tool. It is the best one I know of, and its rules files make it very flexible.
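
To give an idea of what such a rules file looks like, here is a minimal sketch for the layout from the question (trunk/GBI, branches/GBI, tags/GBI). The file name gbi.rules and the SVN_REPO variable are made up for illustration, and the rules follow the standard-layout sample shipped with the tool as far as I remember it, so check them against the samples in the svn2git repository before relying on them. Note that, as far as I know, svn-all-fast-export needs file system access to the SVN repository (or a copy of it) rather than an https:// URL.

```shell
#!/bin/sh
# Sketch only: "GBI" and the SVN paths come from the question's git svn command;
# gbi.rules and SVN_REPO are hypothetical names chosen for this example.
set -eu

cat > gbi.rules <<'EOF'
create repository GBI
end repository

# trunk becomes the master branch
match /trunk/GBI/
  repository GBI
  branch master
end match

# each directory below branches/GBI becomes a Git branch of the same name
match /branches/GBI/([^/]+)/
  repository GBI
  branch \1
end match

# each directory below tags/GBI becomes a Git tag
match /tags/GBI/([^/]+)/
  repository GBI
  branch refs/tags/\1
end match
EOF

# SVN_REPO must be a local path to the repository, not a URL (assumption, see above).
SVN_REPO="${SVN_REPO:-}"
if [ -n "$SVN_REPO" ] && command -v svn-all-fast-export >/dev/null 2>&1; then
  # authors.txt is the same authors file that was passed to git svn
  svn-all-fast-export --identity-map authors.txt --rules gbi.rules "$SVN_REPO"
else
  echo "gbi.rules written; set SVN_REPO and install svn-all-fast-export to run the conversion"
fi
```

The run should leave a Git repository named GBI in the current directory, which you can then push to your new Git server.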

    If you are not 100% sure about the history of your repository, svneverever from http://blog.hartwork.org/?p=763 is a great tool for investigating the history of an SVN repository when migrating it to Git.


    Even though git-svn is easier to start with, here are some further reasons why the KDE svn2git is superior to git-svn, besides its flexibility:

    • svn2git (if the correct one is used) rebuilds the history much better and more cleanly, especially for more complex histories with branches, merges and so on
    • the tags are real tags and not branches in Git
    • with git-svn, each tag contains an extra empty commit, so the tag is not part of any branch. A normal fetch will therefore not retrieve the tags unless you pass --tags, because by default only tags pointing into fetched branches are fetched. With the proper svn2git, tags are where they belong
    • if you changed the layout in SVN, you can easily configure this with svn2git; with git-svn you will lose history eventually
    • with svn2git you can also split one SVN repository into multiple Git repositories easily
    • or combine multiple SVN repositories in the same SVN root into one Git repository easily
    • the conversion is a gazillion times faster with the correct svn2git than with git-svn
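
The point about tags and --tags can be reproduced without any SVN involved. The following is a small sketch that imitates the git-svn situation: it creates a throwaway repository in which the tag v1.0 sits on an extra empty commit that is not part of any branch, then fetches from it twice. The repository names upstream and downstream and the tag name are made up for the demonstration.

```shell
#!/bin/sh
# Demo: a tag on a commit that is not reachable from any branch
# is skipped by a plain fetch and only arrives with --tags.
set -eu

work=$(mktemp -d)
cd "$work"

# "upstream" imitates a git-svn result: one branch plus a tag on an extra commit
git init -q upstream
cd upstream
git config user.name "Demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "commit on the branch"
git checkout -q --detach
git commit -q --allow-empty -m "extra empty commit carrying the tag"
git tag v1.0
git checkout -q -          # back to the branch; the tagged commit stays off-branch
cd ..

# "downstream" fetches from upstream, first without and then with --tags
git init -q downstream
cd downstream
git remote add origin ../upstream
git fetch -q origin
before=$(git tag -l)       # tags known after a normal fetch
git fetch -q --tags origin
after=$(git tag -l)        # tags known after fetching with --tags

echo "tags after normal fetch: '$before'"
echo "tags after fetch --tags: '$after'"
```

After the first fetch the tag list is empty, and only the second fetch brings in v1.0. Git only auto-follows tags that point into the history it has just downloaded.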

    There are many reasons why git-svn is worse and the KDE svn2git is superior. :-)