
Detach many subdirectories into a new, separate Git repository


This question is based on "Detach subdirectory into separate Git repository".

Instead of detaching a single subdirectory, I want to detach a couple. For example, my current directory tree looks like this:

/apps
  /AAA
  /BBB
  /CCC
/libs
  /XXX
  /YYY
  /ZZZ

And I would like this instead:

/apps
  /AAA
/libs
  /XXX

The --subdirectory-filter argument to git filter-branch won't work because it gets rid of everything except for the given directory the first time it's run. I thought using the --index-filter argument for all unwanted files would work (albeit tedious), but if I try running it more than once, I get the following message:

Cannot create a new backup.
A previous backup already exists in refs/original/
Force overwriting the backup with -f

Any ideas? TIA
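
For reference, the message above is what git filter-branch prints when refs/original/ still holds the backup from an earlier run. A minimal sketch of the --index-filter approach follows, assuming it is acceptable to rewrite every ref in place and that the paths contain no spaces. The helper name `drop_paths` is hypothetical. It removes all listed directories in one pass and passes -f, as the message suggests, so a second run does not stop on the existing backup.

```shell
# Hypothetical helper: remove the given paths from every commit on all refs.
# Assumes a clean working tree and paths without whitespace.
drop_paths() {
    # -f overwrites the backup in refs/original/ left by a previous run.
    # --prune-empty drops commits that become empty after the removal.
    # FILTER_BRANCH_SQUELCH_WARNING only silences the notice in newer Git.
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --prune-empty \
        --index-filter "git rm -r -q --cached --ignore-unmatch -- $*" \
        -- --all
}

# Example (not run here):
#   drop_paths apps/BBB apps/CCC libs/YYY libs/ZZZ
```

Listing every unwanted directory in a single call keeps the rewrite to one pass over the history, which is usually faster than one run per directory.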


Solution

  • Answering my own question here... after a lot of trial and error.

    I managed to do this using a combination of git subtree and git-stitch-repo. These instructions are based on:

    First, I pulled out the directories I wanted to keep into their own separate repositories:

    cd origRepo
    git subtree split -P apps/AAA -b aaa
    git subtree split -P libs/XXX -b xxx
    
    cd ..
    mkdir aaaRepo
    cd aaaRepo
    git init
    git fetch ../origRepo aaa
    git checkout -b master FETCH_HEAD
    
    cd ..
    mkdir xxxRepo
    cd xxxRepo
    git init
    git fetch ../origRepo xxx
    git checkout -b master FETCH_HEAD
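
With more than a couple of directories, the same split-and-fetch steps can be wrapped in a small function. This is only a sketch: the name `split_subdir` and its argument order are made up for illustration, and it assumes git subtree is installed and that the Git version supports `git -C`.

```shell
# Hypothetical helper: split one subdirectory of a source repo into its own repo.
#   $1 = source repository, $2 = subdirectory prefix,
#   $3 = temporary branch name, $4 = directory for the new repository
split_subdir() {
    # Resolve to an absolute path so the later fetch works from inside "$4".
    src=$(cd "$1" && pwd) || return 1
    # Create a branch in the source repo holding only the history of "$2".
    git -C "$src" subtree split -P "$2" -b "$3" >/dev/null || return 1
    # Fetch that branch into a fresh repository and check it out as master.
    git init -q "$4" || return 1
    git -C "$4" fetch -q "$src" "$3" || return 1
    git -C "$4" checkout -q -b master FETCH_HEAD
}

# Example (not run here):
#   split_subdir origRepo apps/AAA aaa aaaRepo
#   split_subdir origRepo libs/XXX xxx xxxRepo
```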
    

    I then created a new empty repository, and imported/stitched the last two into it:

    cd ..
    mkdir newRepo
    cd newRepo
    git init
    git-stitch-repo ../aaaRepo:apps/AAA ../xxxRepo:libs/XXX | git fast-import
    

    This creates two branches, master-A and master-B, each holding the content of one of the stitched repos. To combine them and clean up:

    git checkout master-A
    git pull . master-B
    git checkout master
    git branch -d master-A 
    git branch -d master-B
    

    Now I'm not quite sure how/when this happens, but after the first checkout and the pull, the code magically merges into the master branch (any insight on what's going on here is appreciated!)
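
One way to see what actually ended up on a branch is to list the directories it tracks and compare them with the layout asked for in the question. A minimal sketch, with `tracked_dirs` as a hypothetical helper name:

```shell
# Hypothetical helper: print every directory tracked in a commit, one per line.
#   $1 = commit or branch to inspect (defaults to HEAD)
tracked_dirs() {
    # -d restricts the output to tree entries, -r descends into subdirectories.
    git ls-tree -r -d --name-only "${1:-HEAD}"
}

# Example (not run here):
#   tracked_dirs master
#   git log --oneline --graph --all   # shows how the branches relate
```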

    Everything seems to have worked as expected, except that the newRepo commit history contains duplicate commits wherever a changeset affected both apps/AAA and libs/XXX. If there is a way to remove the duplicates, it would be perfect.
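
To get a feel for how many duplicates there are, the log can be grouped by author, author timestamp, and subject. This is a rough sketch that only reports candidates and removes nothing. It assumes the duplicated commits kept identical author, date, and subject line, which may not hold in every case, and `duplicate_commits` is a hypothetical helper name.

```shell
# Hypothetical helper: list commits that appear more than once in a history,
# judged by author name, author timestamp, and subject line.
#   $1 = commit or branch to inspect (defaults to HEAD)
duplicate_commits() {
    # One tab-separated line per commit; uniq -d prints each repeated line once.
    git log --format='%an%x09%at%x09%s' "${1:-HEAD}" | sort | uniq -d
}

# Example (not run here):
#   duplicate_commits master
```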