git, git-filter-branch, git-filter-repo

Extract a given set of files to a repository


I intend to extract a handful of files from a repository with ~10000 commits and 5000 files into a separate repository. Those files are spread across several directories, and there are unrelated files in those directories.

git filter-branch's --subdirectory-filter is not really an option, since it only accepts a single directory. It also doesn't follow renames, so if a file was initially introduced in another directory, that part of its history is lost.

My current solution is:

git filter-branch --tree-filter 'fd -E 3166 -X rm -rf'

Here fd is a faster alternative to find, and 3166 is a unique part of the file name shared by all the files I want to extract. The command goes through every commit, finds all unrelated files, and removes them. This is horribly slow, though: it takes hours.

Is there a better approach?


Solution

  • You could try git filter-repo, which the git project now recommends over filter-branch. It lets you specify more than one path, so I guess you could pass every directory you care about, both the "current" ones and the ones the files lived in historically.

    https://github.com/newren/git-filter-repo
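
    As a rough sketch of what that could look like: git filter-repo's --path option can be repeated, and the tool keeps the union of all the paths given. The helper below is hypothetical (its name and the example paths are made up for illustration, not taken from the question). It only prints the command so you can review it before running it in a fresh clone.

```shell
#!/bin/sh
# Hypothetical helper: print a git filter-repo command that keeps every listed path.
# Each argument is one path (file or directory), current or historical.
# Assumes the paths contain no single quotes.
filter_repo_command() {
    printf 'git filter-repo'
    for path in "$@"; do
        # --path may be repeated; filter-repo keeps the union of all given paths.
        printf " --path '%s'" "$path"
    done
    printf '\n'
}

# Placeholder paths for illustration only.
filter_repo_command src/a/file-3166.c old/location/file-3166.c
```

    If all the wanted files really do share a name fragment such as 3166, the documented --path-glob option (for example --path-glob '*3166*') might replace the explicit list, though I have not tried it on a repository of this size.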