
Keeping all history when moving from svn to git


Can all history be kept when moving from svn to git? I find that history prior to an svn copy is lost.

I have the following svn repository:

  • project1/trunk/A
  • project2/trunk/dir/B
  • project3/trunk
    • A - copied from project1
    • dir/B - copied from project2

If I git svn clone project3, there is no history for A and B from project1 and project2.

Here's a demonstration of the problem:

> svn co https://localhost/svn/test
> cd test
> mkdir -p project1/trunk project1/branches project1/tags
> mkdir -p project2/trunk/dir project2/branches project2/tags
> mkdir -p project3/trunk project3/branches project3/tags
> touch project1/trunk/A project2/trunk/dir/B
> svn add project1 project2 project3
> svn ci -m 'initial commit'
> svn copy project1/trunk/A project3/trunk/
> svn copy project2/trunk/dir project3/trunk/
> svn ci -m 'project restructure'

Running svn log for each shows both revisions:

> svn log project3/trunk/A 
 ------------------------------------------------------------------------
 r2 | tanderson | 2015-04-16 19:37:33 +1000 (Thu, 16 Apr 2015) | 1 line

 project restructure
 ------------------------------------------------------------------------
 r1 | tanderson | 2015-04-16 19:37:32 +1000 (Thu, 16 Apr 2015) | 1 line

 initial commit
 ------------------------------------------------------------------------
> svn log project3/trunk/dir/B 
 ------------------------------------------------------------------------
 r2 | tanderson | 2015-04-16 19:37:33 +1000 (Thu, 16 Apr 2015) | 1 line

 project restructure
 ------------------------------------------------------------------------
 r1 | tanderson | 2015-04-16 19:37:32 +1000 (Thu, 16 Apr 2015) | 1 line

 initial commit
 ------------------------------------------------------------------------

Now for the clone:

> git svn clone --stdlayout --follow-parent https://localhost/svn/test/project3 gittest
  Using higher level of URL: https://localhost/svn/test/project3 => https://localhost/svn/test
  r1 = 1bc0768d6d823b49305978d227df6834d2787fdc (refs/remotes/origin/trunk)
       A       A
       A       dir/B
  r2 = c71c15ec116a7ada952d8457d50902c970616ef5 (refs/remotes/origin/trunk)
  Checked out HEAD:
      https://localhost/svn/test/project3/trunk r2

I'm hoping to see both revisions of A and B, but in both cases only the final revision is shown. E.g.

> cd gittest
> git log --follow A
   commit c71c15ec116a7ada952d8457d50902c970616ef5
   Author: tanderson <tanderson@897fde24-c897-6841-ad7f-93f2e7295302>
   Date:   Thu Apr 16 09:37:33 2015 +0000

    project restructure

    git-svn-id: https://localhost/svn/test/project3/trunk@2 897fde24-c897-6841-ad7f-93f2e7295302
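One way to make this comparison mechanical is to count the revisions that `svn log` reports for a path and set that number against the number of commits `git log` shows for the same file. The following is only a sketch: the helper name `count_svn_revs` is made up for illustration, and it assumes the default `svn log` output format shown above, where each revision header starts with `rN | `.

```shell
#!/bin/sh
# Hypothetical helper: count revision headers in plain `svn log` output
# read from stdin. A header line looks like "r2 | tanderson | 2015-... | 1 line".
# Caveat: a commit message line that itself starts with "rN | " would be
# counted too.
count_svn_revs() {
    grep -c '^ *r[0-9][0-9]* | '
}

# Usage (run the first command in the svn working copy, the second in the
# git clone):
#   svn log project3/trunk/A | count_svn_revs
#   git log --oneline --follow -- A | wc -l
```

For the test repository above, the first command should report 2, while the git clone shows only one commit for A.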

I've tried the following tools:


Solution

  • On the simple test repository above, svn-all-fast-export works.

    It's written in C++ and no prebuilt binaries are provided; I was able to build it using Cygwin.

    To convert the sample svn repository above:

    1. copy the svn repository locally (using svnadmin hotcopy or svnsync)
    2. create an authors file, users.txt, as per https://git-scm.com/book/en/v2/Git-and-Other-Systems-Migrating-to-Git
    3. create a rules file, rules.txt, to instruct svn-all-fast-export how to process repository paths:

      create repository testnew
      end repository
      
      match /project1/trunk/
          repository testnew
          branch project1
      end match
      
      match /project2/trunk/
          repository testnew
          branch project2
      end match
      
      match /project3/trunk/
          repository testnew
          branch master
      end match
      
    4. Run:

      svn-all-fast-export --identity-map=users.txt --rules=rules.txt --stats /path/to/repo/test.svn
      

    This will create a git repository, testnew, in the current directory. The git log command shows both revisions for A and B. E.g.:

    > git log A
      commit eb561b48109620fc64f9f7cf15a752b1730f8d18
      Merge: e3a02bf e1cc2a9
      Author: tanderson <tanderson@localhost>
      Date:   Sun May 24 06:00:36 2015 +0000
    
         project restructure
    
      commit e3a02bff6a7ace21b3789c4f9350e969add44541
      Author: tanderson <tanderson@localhost>
      Date:   Sun May 24 06:00:35 2015 +0000
    
         initial commit
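For step 2, the authors file can be drafted from the svn log rather than typed by hand. The sketch below is an assumption-laden starting point, not part of the original answer: the function name `authors_from_log` is hypothetical, the `@localhost` address is a placeholder, and it assumes the identity map uses the same `login = Full Name <email>` form described in the Git book page linked above. Review and edit the resulting users.txt before running the conversion.

```shell
#!/bin/sh
# Hypothetical helper: read `svn log --xml --quiet` output on stdin and
# print one "login = login <login@localhost>" line per distinct author.
# The name and address on the right-hand side are placeholders to edit.
authors_from_log() {
    sed -n 's|.*<author>\(.*\)</author>.*|\1|p' |
        sort -u |
        while IFS= read -r name; do
            printf '%s = %s <%s@localhost>\n' "$name" "$name" "$name"
        done
}

# Usage (assumes the local copy from step 1 is at /path/to/repo/test.svn):
#   svn log --xml --quiet file:///path/to/repo/test.svn | authors_from_log > users.txt
```

For the test repository this yields a single line for tanderson, which matches the `tanderson <tanderson@localhost>` author shown in the git log output above.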