
How to git clone a single svn project that has been built from several svn projects


Question summary (details below)

How can one clone an svn project that has a long history as a single-project repository, but that was built by merging several projects that each have their own history? With svn tools, one can browse the full history of a file seamlessly, regardless of whether a modification happened before or after the projects were merged. This is what we want to preserve after the migration to git.

Details

As others have discussed, we moved from multiple svn projects to a single svn project several years ago. In other words, we went from something looking like this:

svnrepo/
   frontend/
      trunk
      branches/
         ng/
         ...
      tags/
         1.x
         ...
   backend/
      trunk
      branches/
         ng/
         ...
      tags/
         1.x
         ...

to something looking like this:

svnrepo/
    UnifiedProject/
        trunk/
            frontend/
            backend/
        branches/
        tags/

   frontend/
      trunk
      branches/
         ng/
         ...
      tags/
         1.x
         ...
   backend/
      trunk
      branches/
         ng/
         ...
      tags/
         1.x
         ...

Note that the original layout still exists, though all files have been svn-moved from svnrepo/submodule/trunk to svnrepo/UnifiedProject/trunk/submodule.

This has been done preserving all history (including a previous migration from CVS to SVN) using some svn move commands at the repository level. Let's say this reorganization happened at date D.

The unified svn repository has accumulated a lot of history since date D. Now we are trying to move from this unified svn repository to a single git repository. At first glance, that looks a lot simpler than this, where they move from multiple svn repositories to a single git repository. But in that case, the unified version of the repository doesn't yet have a history of its own.

Using a command similar to git svn clone http://svnrepo/UnifiedProject/ GitUnifiedProject worked well at first glance. All files, branches and tags have been retrieved.

However, looking more closely at the file history, we discovered that all history before date D has been lost. The git svn clone command did not follow the svn moves from the other projects on the same server.
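
One way to check how far back git's history reaches for a given file is to print the oldest commit that touched it. The following is only a minimal sketch: the helper name oldest_commit, the toy repository, the file name app.txt and the dates are all made up for illustration, and in practice you would point the helper at your real clone and a real path.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the date and subject of the oldest commit
# that touched a path, following renames.
oldest_commit() {
    git -C "$1" log --follow --date=short --format='%ad %s' -- "$2" | tail -n 1
}

# Throwaway identity so the toy commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Toy repository (illustration only) with two dated commits.
repo=$(mktemp -d)
git init -q "$repo"
echo "v1" > "$repo/app.txt"
git -C "$repo" add app.txt
GIT_AUTHOR_DATE="2015-03-01 12:00:00 +0000" \
GIT_COMMITTER_DATE="2015-03-01 12:00:00 +0000" \
    git -C "$repo" commit -q -m "first import"
echo "v2" >> "$repo/app.txt"
GIT_AUTHOR_DATE="2016-06-01 12:00:00 +0000" \
GIT_COMMITTER_DATE="2016-06-01 12:00:00 +0000" \
    git -C "$repo" commit -q -am "later change"

oldest=$(oldest_commit "$repo" app.txt)
echo "$oldest"
```

If the date printed for a file in the converted repository is never earlier than date D, the pre-D history of that file did not make it through the conversion.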

Now the question is: how to retrieve the full history (including everything before date D) for each of the individual files that have been moved?

Another inspiration has been this link, which joins two versions of the same repository. I tried this on a small part of the project, say the frontend folder. However, frontend does not have the same layout in the old repository as it has now in the new UnifiedProject repository. It seems that a combination of the two links above might be a good approach, but I haven't yet figured out how to do this.

Maybe another approach would be to clone the whole svn repository directly at its root. That would be huge, as I have oversimplified the structure above and other unrelated projects also live in the repo. But let's say it is possible. Would it make the goal easier to achieve afterwards, maybe by manually deleting all history and projects that are not related to UnifiedProject (cf. this and this)?

Does anyone have experience with a similar use case?


Solution

  • I have found a workflow. First, using the technique described here, one can git svn clone each individual repository and reorganize it so that it has the layout expected in the unified repository. Something like this:

    git svn clone --stdlayout http://svnrepo/frontend frontend
    cd frontend
    mkdir frontend
    shopt -s extglob    # bash only: enables the !(pattern) glob used below
    git mv !(frontend) frontend
    git commit -a -m "Moving frontend project into its own subdirectory"
    

    Let's say you have created a StitchingHistory repository with git init. Within this repo, you can then add every individual repo (e.g. frontend). The important option here is --allow-unrelated-histories:

    cd ../StitchingHistory
    git remote add frontend ../frontend
    git fetch frontend
    git checkout -b feature/merge-frontend  # optional: omit this to merge directly on master
    git merge --allow-unrelated-histories frontend/master
    

    Once you have done this for all individual repositories, make sure everything is merged on the master branch. You should then have a master with the full history prior to date D that also happens to have the same directory structure as your repo post-date D.

    Having that in hand, you can use cherry-picking, as previously mentioned, to glue the stitched historical repo just before the start of the UnifiedProject.
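
The gluing step can be sketched on toy repositories as follows. This is a minimal illustration under stated assumptions, not the exact procedure used on the real project: the repository names old and unified, the file app.txt and the commit messages are hypothetical stand-ins for the stitched pre-D history and for the git svn clone of UnifiedProject. It also assumes that the last stitched commit and the first UnifiedProject commit contain the same tree, so the post-D commits apply cleanly.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway identity so the toy commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

# Create an empty repository whose first branch is named master.
new_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/master
}

# "old" stands in for the stitched pre-D history, already laid out
# with the frontend/ subdirectory.
new_repo old
mkdir old/frontend
echo "v1" > old/frontend/app.txt
git -C old add .
git -C old commit -q -m "pre-D: add app.txt"

# "unified" stands in for the clone of UnifiedProject, whose history
# starts at date D with the same content.
new_repo unified
mkdir unified/frontend
echo "v1" > unified/frontend/app.txt
git -C unified add .
git -C unified commit -q -m "D: import frontend"
echo "v2" >> unified/frontend/app.txt
git -C unified commit -q -am "post-D: change app.txt"

# Glue: replay every commit after the root of "unified" on top of "old".
# The root commit itself is skipped because it would add nothing new.
cd old
git remote add unified ../unified
git fetch -q unified
root=$(git rev-list --max-parents=0 unified/master)
git cherry-pick "$root..unified/master" >/dev/null

# The file history now spans both sides of date D.
history=$(git log --format=%s -- frontend/app.txt)
echo "$history"
```

After this, the log for frontend/app.txt in old lists the post-D change followed by the pre-D commit. Real histories with merge commits or several branches need more care, since git cherry-pick only replays a merge commit when given the -m option.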