Background:
I am migrating a large SVN repository to Git; it has 40,000 revisions and more than 20 GB of data.
I fetched my repository from SVN by running git svn fetch with the following .git/config settings:
[svn-remote "svn"]
ignore-paths = ^[^/]+/(?:branches|tags) <--- note ignoring tags and branches
url = https://svn_server/repos/my_repo
fetch = :refs/remotes/git-svn
As the config settings above show, branches and tags have been ignored, as I just want to migrate the contents of trunk. git svn fetch also retrieved the branches and tags directories to keep the merge history.
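To illustrate which paths the ignore-paths pattern above is meant to catch, here is a minimal sketch. The helper name is_ignored and the sample paths are hypothetical, and because grep -E uses POSIX ERE, the Perl-style (?:...) group is written as a plain (...) group; this only approximates how git svn applies the pattern.

```shell
# Hypothetical helper: reports whether a path matches the ignore-paths pattern.
# POSIX ERE has no (?:...) syntax, so a plain group is used instead.
is_ignored() {
  if printf '%s\n' "$1" | grep -Eq '^[^/]+/(branches|tags)'; then
    echo ignored
  else
    echo kept
  fi
}

# Sample paths are assumptions based on the repo/ layout described in the question.
is_ignored "repo/branches/feature-x/file.c"   # prints: ignored
is_ignored "repo/tags/v1.0/file.c"            # prints: ignored
is_ignored "repo/trunk/src/file.c"            # prints: kept
```

The pattern requires exactly one leading path component before branches or tags, so anything under repo/trunk is left alone.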
At this point the remotes/git-svn branch contains:
repo/
--branches
--tags
--trunk
Goal:
What I want is to have only the contents of trunk in my Git repository, removing branches and tags and keeping only the history of existing files, as I have no need to revert to any branch and I don't need to see or restore any deleted file.
My first attempt was to rewrite history, removing the branches folder with the following command:
git filter-branch --index-filter "git rm -rf --cached --ignore-unmatch branches"
After it had run for about 48 hours, I killed the process. I know I have a large amount of data, but this much time seems unreasonable to me, so I guess I was not heading in the right direction.
By keeping only the history of existing files, I believe I could reduce my repository size from 20 GB to less than 1 GB and then be able to upload it to GitHub.
Question:
Is there a way to clone only the trunk contents to a new Git repository and keep only the history of files in trunk, with no reference to removed files or removed branches?
Well, just clone the trunk and only the trunk:
git svn clone http://svn_server/repos/my_repo/trunk
Note that I point directly to trunk and do not use the -s (--stdlayout) option, which would tell git svn to look for the standard trunk/branches/tags layout.
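The trunk-only clone can be wrapped in a small script, sketched below under some assumptions: the target directory name my_repo_trunk and the run helper are hypothetical, and the script defaults to a dry run that only prints the commands. The final step uses git count-objects to check the resulting size; it does not guarantee any particular reduction.

```shell
# Assumed values; replace them with your own.
SVN_TRUNK_URL="http://svn_server/repos/my_repo/trunk"
TARGET_DIR="my_repo_trunk"   # hypothetical local directory name
DRY_RUN=1                    # set to 0 to actually execute the commands

# Hypothetical helper: prints the command in dry-run mode, otherwise runs it.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Clone only trunk; without -s, git svn treats the URL as a single line of history.
run git svn clone "$SVN_TRUNK_URL" "$TARGET_DIR"

# Inspect the on-disk size of the resulting repository.
run git -C "$TARGET_DIR" count-objects -vH
```

With 40,000 revisions the clone may still take a long time, since git svn replays each revision that touches trunk.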