
Why do a lot of revisions get lost in git-svn?


I am currently trying to move a couple of projects from svn to git using git-svn. No matter which project I am trying to convert, there are many more svn revisions than git commits. I don't really know svn, so I have a hard time figuring out why that happens.

I have basically followed this guide for the conversion. The loss along the way was substantial: one project went from 18 svn revisions to 9 git commits, another from 131 revisions to 10 commits. This happened in projects with many branches as well as in projects without any branches.

So far I have tried the --stdlayout option (as suggested in Missing revisions after "git svn clone"). I have also tried svn2git, but that failed as well, likely because my projects don't seem to have the infrastructure svn2git requires (the format and db files are missing). I guess I will eventually find a way to do this with one of the other tools available (like this), but I would really like to know why this happens.

So: Does anyone know why so many revisions are not shown in the commit history when using git-svn for converting svn repositories to git? Is git-svn just buggy or are there some revision types which just aren't shown in a git commit history?

Update

I have since found out that svn log, unlike git log, shows all revisions in the history, not just the ones made to trunk/master. This means that many of the revisions I thought were missing were actually just in branches. However, even so, not all revisions are in the commit history. The ones that are missing are those that show up when calling svn log inside the branches directory itself, but not inside any of the individual branches. git-svn probably does not import them because they affect neither master nor any of the branches. While this is clear to me now, I'm still somewhat at a loss as to the significance of this. Are those revisions important, or is the git history fine without them?
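To make the path scoping of svn log concrete, here is a minimal Python sketch. It is only a toy model, not how svn is implemented: the `svn_log` function and the `history` data are made up for illustration, and real svn log also follows copy history and lists the newest revision first.

```python
def svn_log(revisions, path=""):
    """Return the revision numbers that changed `path` or anything below it.

    `revisions` maps a revision number to the list of paths it changed.
    Toy model only: results are in ascending order for readability.
    """
    prefix = path.strip("/")
    hits = []
    for rev, changed in sorted(revisions.items()):
        for changed_path in changed:
            p = changed_path.strip("/")
            if prefix == "" or p == prefix or p.startswith(prefix + "/"):
                hits.append(rev)
                break
    return hits


# Hypothetical history, loosely modelled on the layout in this question.
history = {
    1: ["trunk/main.c"],
    2: ["branches/branchA/main.c"],
    3: ["branches/Readme_branch.txt"],
    4: ["trunk/main.c"],
}

print(svn_log(history))                      # [1, 2, 3, 4]
print(svn_log(history, "trunk"))             # [1, 4]
print(svn_log(history, "branches"))          # [2, 3]
print(svn_log(history, "branches/branchA"))  # [2]
```

In this model, revision 3 appears when asking for the log of branches but not for any individual branch, which is exactly the kind of revision described above.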

Update 2

The .git/config file

[core]
    repositoryformatversion = 0
    filemode = false
    bare = false
    logallrefupdates = true
    ignorecase = true
[svn-remote "svn"]
    noMetadata = 1
    url = svn://TheURL/TheRepository
    fetch = trunk:refs/remotes/svn/trunk
    branches = branches/*:refs/remotes/svn/*
    tags = tags/*:refs/remotes/svn/tags/*
[svn]
    authorsfile = /a/file/path/svn/authors-transform.txt
[remote "origin"]
    url = https://github.com/User/TheRepository.git
    fetch = +refs/heads/*:refs/remotes/origin/*

The file structure:

TheRepository
|
+--branches
|  |
|  +--branchA
|  |
|  +--Readme_branch.txt
|
+--trunk
|
+--tags

Solution

  • So today I learnt (with many thanks to eftshift0):

    1. svn log and git log show different things

    svn log returns every revision that touched the directory you are currently in or anything below it. So if you're in the root directory, it will show all changes made to all branches, the trunk and the tags. This is unlike git log, which by default only shows the commits reachable from the branch you're currently on. This might seem obvious if you know svn, but I didn't, so...

    2. What --stdlayout does

    Specifying --stdlayout means that git-svn will only import files and revisions located within trunk, branches or tags. Inside branches and tags it only picks up the individual branches and tags (that is, directories), so if for some reason there are additional files directly in branches (but not in any of the actual branches), these won't be in the git repository. This also means that revisions that only touch these files will not be turned into commits.

    These two facts were the reasons why the commits I had differed from the revisions. I hope this information will be useful to someone else one day :)
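The second point can be sketched in Python as well. This follows the description above and is a simplified model under stated assumptions, not git-svn's actual implementation: `ref_for_path`, `dropped_revisions` and the sample `history` are hypothetical names and data, and each changed path carries a flag saying whether it is a directory.

```python
from typing import Dict, List, Optional, Tuple


def ref_for_path(path: str, is_dir: bool = False) -> Optional[str]:
    """Return the branch/tag a path would belong to under the standard layout.

    Returns None when the path lies outside trunk, branches/<name> and
    tags/<name>. Simplified model of the behaviour described in this answer.
    """
    parts = [p for p in path.strip("/").split("/") if p]
    if not parts:
        return None
    if parts[0] == "trunk":
        return "trunk"
    if parts[0] in ("branches", "tags") and len(parts) >= 2:
        # A plain file sitting directly in branches/ or tags/ is not a branch.
        if len(parts) == 2 and not is_dir:
            return None
        prefix = "" if parts[0] == "branches" else "tags/"
        return prefix + parts[1]
    return None


def dropped_revisions(revisions: Dict[int, List[Tuple[str, bool]]]) -> List[int]:
    """Return revisions in which no changed path maps to trunk, a branch or a tag."""
    return sorted(
        rev
        for rev, changed in revisions.items()
        if all(ref_for_path(p, d) is None for p, d in changed)
    )


# Hypothetical history: each entry is (changed path, is_directory).
history = {
    1: [("trunk", True), ("branches", True), ("tags", True)],
    2: [("trunk/main.c", False)],
    3: [("branches/branchA", True)],
    4: [("branches/branchA/main.c", False)],
    5: [("branches/Readme_branch.txt", False)],
}

print(dropped_revisions(history))  # [5]
```

Only revision 5, which touches nothing but a file sitting directly in branches, ends up without a commit in this model. That matches the Readme_branch.txt case from the file structure in the question.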