I recently discovered that a couple of folders in my solution have two distinct paths in Git (GitHub shows two separate folders), one being FooBar and the other being Foobar. This is because some files were registered with the former folder name as their path, and some with the latter.
This was discovered locally (in Windows) by configuring Git to not ignore case: git config core.ignorecase false
I took a stab at fixing this by deleting the whole folder, committing, then re-adding the folder and committing again. This fixed the problem, but the files that got their paths changed lost their Git history. Running gitk against the new path for these files showed just the one commit. Running gitk against their old path revealed their whole history.
Next stab: use git mv to move the file:
git mv Foobar/file.txt FooBar/file.txt
This yields the error:
fatal: destination exists, source=Foobar/file.txt, destination=FooBar/file.txt
And if I try deleting the file first, of course Git complains that the source file doesn't exist.
Then I discovered Git doesn't complain about the destination already existing if you add -f to the mv command. However, after committing that rename, gitk shows that the history got severed anyway!
I even attempted the three-step dance described here, but this was just another way of doing the -f. Same result.
Basically I just want to move a file from Foobar/file.txt to FooBar/file.txt on a case-insensitive operating system in some way, while preserving Git history. Is this possible?
There is no simple solution to the real problem.
In Git, files don't have history. Commits have history, or more precisely, commits are the history. That is all the history there is. For Git to "follow" a file, as in git log --follow <path>, Git looks at the commits, one at a time, comparing each commit to its parent commit.
If a diff between parent and child shows that the parent contains a file named parent/path/to/pfile and the child contains a file named child/path/to/cfile and the content of these two files, in these two commits, is "sufficiently similar" (several conditions must hold here), then, in Git's "eyes", that parent-to-child transition represents a rename of that file. So at that point, git log --follow, which had been looking for child/path/to/cfile, starts looking instead for parent/path/to/pfile.
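To make the parent-versus-child comparison concrete, here is a minimal sketch of the idea in Python. It is not Git's actual algorithm: the snapshots are hypothetical dicts mapping path to content, the similarity score comes from difflib rather than Git's own delta-based measure, and the names detect_renames and follow are made up for illustration. The 0.5 threshold is chosen to resemble Git's default 50% similarity index.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a rough 0.0-1.0 similarity score between two file contents."""
    return SequenceMatcher(None, a, b).ratio()


def detect_renames(parent: dict, child: dict, threshold: float = 0.5) -> dict:
    """Pair paths that vanished from the parent with paths new in the child.

    Both arguments are hypothetical snapshots mapping path -> content.
    Paths are compared byte for byte, so "Foobar" and "FooBar" differ.
    Returns {child_path: parent_path} for pairs that look like renames.
    """
    deleted = [p for p in parent if p not in child]
    added = [p for p in child if p not in parent]
    renames = {}
    for new_path in added:
        best_path, best_score = None, threshold
        for old_path in deleted:
            score = similarity(parent[old_path], child[new_path])
            if score >= best_score:
                best_path, best_score = old_path, score
        if best_path is not None:
            renames[new_path] = best_path
            deleted.remove(best_path)
    return renames


def follow(path: str, snapshots: list) -> list:
    """Walk snapshots newest-first, switching names when a rename is found."""
    names = [path]
    for child, parent in zip(snapshots, snapshots[1:]):
        path = detect_renames(parent, child).get(path, path)
        names.append(path)
    return names
```

Note that nothing is stored per file here: the "history of a file" is recomputed from the snapshots every time, which is why a commit that breaks the similarity pairing also breaks the apparent history.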
Without --follow, git log does not do this special "find a rename" operation ... and in general, Git believes that any path names with any byte-level difference represent different files. In other words, case-folding and Unicode normalization do not occur. Consider, e.g., the word schön, which can be represented as either s c h ö n or s c h o combining-¨ n. We can, on a Linux box, create two different files using these two different UTF-8 encoded names. Running ls will show two files whose names appear the same:
$ cat umlaut.py
import os
p1 = u'sch\N{latin small letter o with diaeresis}n'
p2 = u'scho\N{combining diaeresis}n'
os.close(os.open(p1.encode('utf8'), os.O_CREAT, 0o666))
os.close(os.open(p2.encode('utf8'), os.O_CREAT, 0o666))
$ python umlaut.py
$ ls
schön schön umlaut.py
Git is perfectly happy to store both files, separately. However, macOS refuses to allow both files to coexist, in the same way that Windows (and, by default, macOS as well) refuses to allow both Foobar and FooBar to coexist.
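The two views of "the same name" can be sketched in a few lines of Python. This is only an approximation: the function names are hypothetical, and real file systems apply their own folding and normalization rules rather than NFC plus casefold.

```python
import unicodedata

# Two spellings of the same visible word, as in umlaut.py
precomposed = "sch\N{LATIN SMALL LETTER O WITH DIAERESIS}n"
decomposed = "scho\N{COMBINING DIAERESIS}n"


def same_name_bytewise(a: str, b: str) -> bool:
    """Byte-for-byte comparison of the UTF-8 encoded names."""
    return a.encode("utf-8") == b.encode("utf-8")


def same_name_folded(a: str, b: str) -> bool:
    """Rough stand-in for a case-insensitive, normalizing file system."""
    def key(name: str) -> str:
        return unicodedata.normalize("NFC", name).casefold()
    return key(a) == key(b)
```

Under the byte-wise comparison both the schön pair and the Foobar/FooBar pair are two distinct names; under the folded comparison each pair collapses to one.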
If you make Git store the file in new commits under the new byte sequence, history is preserved; it's just not the history you want preserved. But the history that's already in the repository is already not the history you want preserved.
In practice, you should probably just rename the file in Git's eyes (which has no effect on the file's name in your OS's eyes; FooBar and Foobar are the same name here) and get on with things. Your alternative is to rewrite all history going back in time to the point at which the bad pairings were first added to the repository, by copying (with slight modifications) each "bad" commit to a new-and-improved "good" commit. But this then means getting everyone who uses the repo to switch from "bad old repo" to "new and improved good repo".
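Whichever route you take, it helps to know which directory spellings collide. The following is a small sketch, assuming you already have the repository's paths as a list of strings (for example, from the output of git ls-files). The helper name find_collisions and the NFC-plus-casefold key are assumptions for illustration, not anything Git provides.

```python
import unicodedata
from collections import defaultdict


def fold(path: str) -> str:
    """Approximate how a case-insensitive file system would see a path."""
    return unicodedata.normalize("NFC", path).casefold()


def find_collisions(paths: list) -> list:
    """Return groups of directory spellings that differ only after folding."""
    groups = defaultdict(set)
    for path in paths:
        parts = path.split("/")[:-1]  # directory components only
        for depth in range(1, len(parts) + 1):
            directory = "/".join(parts[:depth])
            groups[fold(directory)].add(directory)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


if __name__ == "__main__":
    tracked = ["FooBar/a.txt", "Foobar/file.txt", "src/main.c"]
    for group in find_collisions(tracked):
        print(" vs ".join(group))
```

Each reported group is a set of paths that Git stores separately but that a case-insensitive file system treats as a single directory.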