I have a Git repository which contains a number of subdirectories. Now I have found that one of the subdirectories is unrelated to the other and should be detached to a separate repository.
How can I do this while keeping the history of the files within the subdirectory?
I guess I could make a clone and remove the unwanted parts of each clone, but I suppose this would give me the complete tree when checking out an older revision etc. This might be acceptable, but I would prefer to be able to pretend that the two repositories doesn't have a shared history.
Just to make it clear, I have the following structure:
XYZ/
.git/
XY1/
ABC/
XY2/
But I would like this instead:
XYZ/
.git/
XY1/
XY2/
ABC/
.git/
ABC/
Update: This process is so common, that the git team made it much simpler with a new tool, git subtree
. See here: Detach (move) subdirectory into separate Git repository
You want to clone your repository and then use git filter-branch
to mark everything but the subdirectory you want in your new repo to be garbage-collected.
To clone your local repository:
git clone /XYZ /ABC
(Note: the repository will be cloned using hard-links, but that is not a problem since the hard-linked files will not be modified in themselves - new ones will be created.)
Now, let us preserve the interesting branches which we want to rewrite as well, and then remove the origin to avoid pushing there and to make sure that old commits will not be referenced by the origin:
cd /ABC
for i in branch1 br2 br3; do git branch -t $i origin/$i; done
git remote rm origin
or for all remote branches:
cd /ABC
for i in $(git branch -r | sed "s/.*origin\///"); do git branch -t $i origin/$i; done
git remote rm origin
Now you might want to also remove tags which have no relation with the subproject; you can also do that later, but you might need to prune your repo again. I did not do so and got a WARNING: Ref 'refs/tags/v0.1' is unchanged
for all tags (since they were all unrelated to the subproject); additionally, after removing such tags more space will be reclaimed. Apparently git filter-branch
should be able to rewrite other tags, but I could not verify this. If you want to remove all tags, use git tag -l | xargs git tag -d
.
Then use filter-branch and reset to exclude the other files, so they can be pruned. Let's also add --tag-name-filter cat --prune-empty
to remove empty commits and to rewrite tags (note that this will have to strip their signature):
git filter-branch --tag-name-filter cat --prune-empty --subdirectory-filter ABC -- --all
or alternatively, to only rewrite the HEAD branch and ignore tags and other branches:
git filter-branch --tag-name-filter cat --prune-empty --subdirectory-filter ABC HEAD
Then delete the backup reflogs so the space can be truly reclaimed (although now the operation is destructive)
git reset --hard
git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
git reflog expire --expire=now --all
git gc --aggressive --prune=now
and now you have a local git repository of the ABC sub-directory with all its history preserved.
Note: For most uses, git filter-branch
should indeed have the added parameter -- --all
. Yes that's really --space-- all
. This needs to be the last parameters for the command. As Matli discovered, this keeps the project branches and tags included in the new repo.
Edit: various suggestions from comments below were incorporated to make sure, for instance, that the repository is actually shrunk (which was not always the case before).