
git svn fetch repeatedly fails for no apparent reason


Background

We are planning a one-way migration from SVN to Git in Azure DevOps so that we keep our commit history and messages. As you might expect, we did a trial run. After much hair pulling, and standing on the shoulders of colleagues who came before, we came up with a list of commands that finally worked after 26 hours of processing.

Those commands are:

Run in Git Bash to get a list of all authors from SVN in Git format:

svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > authors-transform.txt
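
To sanity-check the pipeline before pointing it at the real repository, the awk stage can be fed a few lines shaped like `svn log -q` output. This is a minimal sketch: the user names and dates are made up, and `authors_transform` is a hypothetical helper name, not part of the original commands.

```shell
# Hypothetical helper wrapping the awk/sort stage of the command above.
authors_transform() {
  awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u
}

# Sample input shaped like `svn log -q` output (made-up users and dates).
sample_log='------------------------------------------------------------------------
r3 | alice | 2021-03-01 10:00:00 +0000 (Mon, 01 Mar 2021)
------------------------------------------------------------------------
r2 | bob | 2021-02-28 09:00:00 +0000 (Sun, 28 Feb 2021)
------------------------------------------------------------------------
r1 | alice | 2021-02-27 08:00:00 +0000 (Sat, 27 Feb 2021)
------------------------------------------------------------------------'

# Prints one "user = user <user>" line per distinct author.
printf '%s\n' "$sample_log" | authors_transform
```

Each resulting line can then be edited by hand so the right-hand side holds the real name and email address you want in the Git history.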

Run in a Windows cmd shell as admin. The files are copied in because, after many attempts, they had been built up using git ignore commands and git lfs track commands, so I saved them for reuse. The same goes for the Git config file, where I tell it the explicit SVN tags to process:

git svn init --prefix "" --no-metadata --trunk=Trunk --branches=Branches --tags=Tags https://jeeves/svn/ResourceDirectoryPortal/
git lfs install
copy ..\.gitattributes
copy ..\.gitignore
copy ..\authors-transform.txt
copy ..\config .\.git
git add .gitattributes
git add .gitignore
git commit -m "Preparation"
git svn fetch --log-window-size=2500 -A authors-transform.txt

Run in Git Bash to create Git tags:

for t in $(git for-each-ref --format='%(refname:short)' refs/remotes/tags); do git tag ${t/tags\//} $t && git branch -D -r $t; done
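
The loop relies on each short ref name looking like `tags/<name>`, which it strips down to get the Git tag name. Below is a small dry-run sketch of that renaming step with made-up ref names. It uses the POSIX prefix strip `${t#tags/}` in place of the bash-only `${t/tags\//}` (the two agree whenever the name starts with `tags/`), and it only prints the commands rather than running them. `plan_tag_conversion` is a hypothetical helper name.

```shell
# Dry run: print the git commands the loop above would execute.
# plan_tag_conversion is a hypothetical helper; it reads short ref names on stdin.
plan_tag_conversion() {
  while IFS= read -r t; do
    name=${t#tags/}   # strip the leading "tags/" prefix
    printf 'git tag %s %s && git branch -D -r %s\n' "$name" "$t" "$t"
  done
}

# Made-up ref names, shaped like the output of
# git for-each-ref --format='%(refname:short)' refs/remotes/tags
printf '%s\n' 'tags/v1.0' 'tags/release-2020-06' | plan_tag_conversion
```

Reviewing the printed commands first is a cheap way to spot unexpected tag names before anything is created or deleted.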

Run a series of the following commands from a batch file in Windows cmd to get rid of incorrectly created tags. This was easier than trying to fix the cause and waiting through another 26 hours of migration time:

git tag -d NameOfTagHere

We then remove the "SVN" section from /.git/config and delete the /.git/svn folder, after which we run the following from a Windows cmd shell as admin:

git remote add origin https://AzureDevops.url.here/for-empty-git-repo
git config http.version HTTP/1.1
git push origin --all
git push origin --tags

Problem

Because we had some compressed SQL Server .BAK files in the repo (restored as part of build tests, automated integration tests, etc.), the Azure Git repo took about 11 minutes to clone. I was hoping Git LFS might make the impact of these files more acceptable, but decided to do another test whereby:

  • **/*.bak was added to the .gitignore file
  • *.bak LFS tracking line was removed from the .gitattributes file
  • Ran the same set of commands as above. After a random amount of time, typically in the 1 to 2 hour range, I get one of the following errors, each time with a different SVN revision/file shown as the last line of output before it occurs:
    1. ls-tree, command error 127
    2. error closing pipe: Broken pipe at C:/Program Files/Git/mingw64/share/perl5/Git/IndexInfo.pm line 32. at /usr/lib/perl5/vendor_perl/SVN/Ra.pm line 623
    3. rev-parse --git-path svn: command returned error: 127
    4. config svn-remote.svn.tags-maxRev 2062: command returned error: 127

Theories Tested

  • The one thought I had was that it might be the Azure DevOps workaround I used to get Git LFS working correctly with larger files, that is, the following command: git config http.version HTTP/1.1

    I tried again without running that command and still got one of the errors above.

  • On subsequent attempts I just re-ran the git svn fetch command from above, in case it was some transient issue. Nope, it wasn't!

  • On the latest attempt I emptied the folder and started over, skipping the Azure DevOps line. It hasn't failed again yet, but it's only been about an hour.
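
Re-running the fetch by hand after each failure can be automated with a small retry wrapper. This is only a sketch: `retry` is a hypothetical helper, the attempt count and delay are arbitrary, and it does nothing to cure whatever makes the fetch fail. It just saves re-typing the command.

```shell
# retry MAX DELAY CMD...: run CMD until it succeeds or MAX attempts are used up.
retry() {
  max=$1; delay=$2; shift 2
  attempt=1
  while :; do
    "$@" && return 0
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example invocation (not run here), reusing the fetch command from above:
# retry 5 60 git svn fetch --log-window-size=2500 -A authors-transform.txt
```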

Stumped

I'm fairly noob-ish when it comes to using Git. I know that it's distributed and that this is fundamentally different to SVN. I just don't understand how the same commands can randomly blow up without a more detailed error. The only difference in the SVN repo is that there have been 50-100 more commits.

UPDATE 08/03/2021: Tried running the entire thing via Git Bash, as opposed to just some of it, and started getting the 3rd error listed above. I was clutching at straws here. I even tried reducing the log-window-size parameter from 2500 to 1000 and then to 500. Still no joy.

UPDATE 09/03/2021: One of our sysadmins looked at the SVN logs, and it seems the server was being hammered by the same generic user from 2 different machines. I contacted the people responsible for those machines and got one of the machines stopped from what it was doing, which in theory cut about 50% of the sustained traffic to the SVN server. I tried again and got error 4, which I added to the list of errors above. So it seems it isn't a timeout issue.


Solution

  • I don't know 100% whether the following explanation is right, but git svn fetch is now completing without errors and without simply stopping part way through.

    I examined the SVN logs from the server. On the days the errors were occurring, the SVN log had grown to 0.5GB by the end of the day, the majority of which was not my Git migration. I suspect that some form of timeout was occurring and that git, frustratingly, gave an unhelpful error message or none at all. When the git svn fetch command worked, the SVN logs were down to 70MB by the end of the day.

    Thus the moral of the story, it seems, is to check your SVN logs, because git isn't going to tell you anything useful.
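
As a rough illustration of that kind of log check, the sketch below counts requests per user and client address. It assumes the SVN server is fronted by Apache and writes an access log in Common Log Format (client address in field 1, authenticated user in field 3), which may not match your setup. The sample lines, addresses and user names are invented, and `summarize_access` is a hypothetical helper name.

```shell
# Count requests per (user, client address), busiest first.
# Assumes Common Log Format: field 1 = client address, field 3 = authenticated user.
summarize_access() {
  awk '{print $3, $1}' | sort | uniq -c | sort -rn
}

# Invented sample lines; on a real server you would pipe in the access log file instead.
sample='10.0.0.5 - builduser [09/Mar/2021:10:00:01 +0000] "PROPFIND /svn/repo HTTP/1.1" 207 512
10.0.0.6 - builduser [09/Mar/2021:10:00:02 +0000] "PROPFIND /svn/repo HTTP/1.1" 207 512
10.0.0.5 - builduser [09/Mar/2021:10:00:03 +0000] "REPORT /svn/repo HTTP/1.1" 200 2048
10.0.0.9 - migrator [09/Mar/2021:10:00:04 +0000] "REPORT /svn/repo HTTP/1.1" 200 4096'

printf '%s\n' "$sample" | summarize_access
```

A single generic user dominating the top of that list from more than one address is the pattern our sysadmin spotted.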