Tags: git, svn, migration, git-svn

Migrating specific SVN branches to GIT (trunk migrated in 2018)


I have an SVN repo that was partially migrated to Git (Bitbucket) sometime during 2018. Both the SVN and Git repos are in use: the SVN side mostly holds branches for older projects, while the Git repo has been used for trunk/master development. Now I'm being asked to migrate the remaining branches from SVN to Git. The problem is that I don't know exactly how the original conversion was done (the person who did it has left the company). I can easily get the SVN repo converted to Git, but I can never get it into a state where the commit ids actually match. Currently the approach that gets me the closest seems to be

git svn clone -T trunk URL --no-minimize-url --no-metadata -r1:10 --preserve-empty-dirs 

I'm using -r on only a few revisions so it's faster. The author actually gets the same email (name@UUID), and the UUID matches the Git repo converted in 2018. The problem seems to be that the SVN history starts with 2 commits that contain only folders and a commit message. These commits are ignored in my conversion, but in the older migration they somehow resulted in a Git commit with just the message.

Example:

svn

commit A: create dir 1

commit B: create dir 2

commit C: create some files in dir 1 and 2

original git migration result:

commit A: commit message only

commit B: commit message only

commit C: create some files in dir 1 and 2

my current efforts using git svn:

commit C: create some files in dir 1 and 2

attempt using SubGit:

subgit import --trunk trunk --username user --svn-url URL

commit A: commit message only (modified with notes)

commit B: commit message only (modified with notes)

commit C: create some files in dir 1 and 2 (commit message modified with notes)

Notes on SubGit: this approach is pretty close (based on the documentation, I'm pretty sure I can get the commit message fixed). The problem is that if I do not provide an authors file, I end up with 'user@localdomain' instead of 'user@uuid'. Would a possible workaround be to export the authors from the original migration and provide them as the authors file?

Does anyone have suggestions on how to do this? If I cannot get the same commit ids on the Git side, is there some sane way of "merging" two Git repos if I can find the common ancestor (even though the ids don't match)?


Solution

  • This is SubGit's default behavior: if it is not given an author mapping (or if no match is found for an SVN username), it generates a Git identity from the SVN username and the default domain (set by the core.defaultDomain SubGit configuration setting or the default-domain command option). So if you need SubGit to set a specific Git user identity for a given commit, it is indeed better to provide an authors mapping file. This can be done with the "authors-file" command option (which you probably know, but still :))

    subgit import --trunk trunk --username user --authors-file <AUTHORS_FILE_PATH> --svn-url URL
    

    I didn't completely understand what is wrong with the commit messages; I assume the commit messages in Git do not match those in SVN? If so, this can also be fixed with SubGit, though not with the subgit import one-liner: it requires an import with preliminary configuration and editing of the configuration file. So, first run the following command to prepare a Git repository for import:

    subgit configure --svn-url URL <GIT_REPO>
    

    where GIT_REPO is the path to the new Git repository for the import. After the repository is prepared, edit the GIT_REPO/subgit/config file: set core.defaultDomain and core.authorsFile if needed, set the correct mapping in the [svn] section, and configure the desired commit message with the svn.gitCommitMessage setting. There are a few more details about this setting here:

    https://subgit.com/documentation/config-options.html#svn.gitCommitMessage

    After the configuration file is set, the import can be started with a short command:

    subgit import GIT_REPO
    

    As for the Git commit notes: SubGit always creates the notes, but they don't affect the commits' SHA-1, so no action is needed for them. Note, by the way, that authors and commit messages are not the only settings that may affect the commits' SHA-1; there are also settings like 'svn.excludePath' or 'translate.createEmptyGitCommits'.
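
To build the authors mapping from the 2018 Git repository, something like the following might work. It is only a sketch and rests on two assumptions: that the original migration stored the SVN username as the Git author name (git svn does this when no authors file is used), and that the authors file takes lines of the form `svn_user = Git Name <email>`, so please check the SubGit documentation for the exact format. The function name `make_authors_map` and the output file `authors.txt` are made up for illustration.

```shell
# Hypothetical helper: read "name<TAB>email" lines on stdin and print one
# de-duplicated mapping line per author in the assumed authors-file format:
#   svn_user = Git Name <email>
# Assumption: the Git author name equals the SVN username.
make_authors_map() {
  sort -u | awk -F '\t' 'NF == 2 { printf "%s = %s <%s>\n", $1, $1, $2 }'
}

# Usage, run inside the Git repository produced by the original migration:
#   git log --all --format='%an%x09%ae' | make_authors_map > authors.txt
```

The resulting file can then be passed to the "authors-file" option shown above. If the original migration rewrote author names, the left-hand side of each line would need to be corrected by hand.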