Even when setting core.autocrlf=true, we are still seeing line endings committed as CRLF instead of LF (I can see the ^M symbol when running git diff and git log -p).
This sometimes causes merge conflicts, because different developers use different settings in their editors.
How can we fix this with minimal future conflicts in a very active repository environment?
I generally recommend using .gitattributes here, rather than setting core.autocrlf (not that I actually deal with this, but that's how the Git project folks do it, and presumably they know).
.gitattributes example:
*.ts text eol=lf
That's not going to fix your problems, but it should help avoid new problems in the future.
To handle merge issues, consider running:
git merge -X renormalize
and/or setting merge.renormalize to true.
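As a minimal sketch of the configuration route, the setting can be stored per repository with git config. The throwaway repository created with mktemp is only there so the example runs anywhere; in practice you would run the git config line inside your real clone.

```shell
set -eu
# Throwaway repository so the example does not touch a real clone.
demo=$(mktemp -d)
cd "$demo"
git init -q .

# Make every merge in this repository behave like "git merge -X renormalize".
git config merge.renormalize true

# Read the value back to confirm it was stored.
setting=$(git config --get merge.renormalize)
echo "merge.renormalize = $setting"
```

Since this lives in each clone's own configuration, every developer has to set it themselves; it is not shared through commits.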
It's worth pointing out that Git never does any conversions on commit. The mechanism that implements CRLF-to-LF-only conversion sits not in git commit, but rather in git add. To see why, we must start with a few Git basics:
The index is also called the staging area, which is a better name in terms of how users use it, although the ways Git uses it go beyond this (which is why it has the name cache as well, giving it three names: cache, index, and staging area). In any case, these extra "copies" of files exist, in Git's index. I put "copies" in quotes here because what's in the index is already in the special form Git uses. These are not ordinary files and cannot be edited either (they can be replaced though).
Instead, there's a third copy of every file. This third copy is an ordinary file. These are the files that you can see and edit. The catch is that these files are not in Git. They are extracted from Git during git checkout or git switch, or when using git reset or git restore, for instance.
During extraction, Git has two options: it can leave the file alone, or it can change it. The change can include replacing LF-only line endings with CRLF line endings. You now have a file you can look at. If you choose to have files extracted such that they have CRLF line endings, you also have configured Git, at this point, to make "undo CRLF line endings" changes if and when you have Git replace the index copy.
What git add does is tell Git: Make the index copy match the working tree copy. If you've changed the working tree copy, Git will now compress and Git-ify the working tree copy and use that to replace the copy that is in the index. Unfortunately for CRLF-line-ending-fixing, if Git thinks that there's no need to replace the index copy, Git does nothing at all.[1]
Unfortunately, Git checks neither core.autocrlf nor any settings in .gitattributes when deciding whether or not it needs to replace the index copy of some file. So changing either of these does not count as a "change" to any file.[2] In Git 2.16 and later, git add --renormalize helps tell Git: Hey, you dummy, I changed my EOL conversions, so adding a file changes it even if I haven't changed it. In Git versions predating 2.16, you must trick Git into believing that you changed the file if all you're trying to do is fix the line endings. (There are a bunch of ways to do that, but let's just pretend you have git add --renormalize. 😀)
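Here is a rough sketch of that renormalize step in a throwaway repository. The file name app.ts, the commit messages, and the demo identity are made up for illustration, and it assumes Git 2.16 or later. It commits a CRLF file, adds the .gitattributes rule from above, and then asks Git to redo the conversion for tracked files.

```shell
set -eu
# Throwaway repository; every path and name here is made up for the demo.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"
git config core.autocrlf false   # no conversions yet, so CRLF gets stored as-is

# Commit a file whose repository copy has CRLF line endings.
printf 'line one\r\nline two\r\n' > app.ts
git add app.ts
git commit -q -m "add app.ts with CRLF endings"
before=$(git ls-files --eol app.ts)

# Add the attribute. On its own this does not touch the index copy of app.ts,
# and a plain "git add" may skip the file because it looks unchanged.
printf '*.ts text eol=lf\n' > .gitattributes
git add .gitattributes

# Ask Git to re-run the conversion for all tracked files (Git 2.16+).
git add --renormalize .
after=$(git ls-files --eol app.ts)

echo "before: $before"
echo "after:  $after"
git commit -q -m "normalize line endings"
```

The i/ column printed by git ls-files --eol describes the index copy, so comparing the two lines shows whether the staged version changed even though the working tree file was never edited.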
Think of Git's index as holding the proposed next commit. When you run git commit, Git simply snapshots the index. Since it already holds Git-ified files, this goes pretty fast.
In any case, the end result here is this:
At the time Git copies a file from Git's index (or, with git restore, from a commit) to your working tree, Git applies the "make the file useful to you, personally" end-of-line changes.
At the time Git copies a file from your working tree to Git's index, Git applies the "make the file normalized for the repository" changes.
When you run git commit, Git uses the copy that is in Git's index to make the commit. So whatever line endings appear in the index copy, those are the line endings that go into the permanent copy in a new commit. You can't see those copies directly, but git ls-files --eol can report which line endings they use.
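To make those three points concrete, here is a small sketch that stages a CRLF file with core.autocrlf=true and then counts carriage returns in the committed copy and in the working tree copy. The file name notes.txt and the demo identity are hypothetical; the repository is a throwaway one.

```shell
set -eu
# Throwaway repository; notes.txt is a made-up file name.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"
git config core.autocrlf true

# The working tree file uses CRLF, as a Windows editor would write it.
printf 'alpha\r\nbeta\r\n' > notes.txt

# The conversion happens here, at "git add" time, not at "git commit" time.
git add notes.txt
git commit -q -m "add notes.txt"

# i/ is the index copy, w/ is the working tree copy.
git ls-files --eol notes.txt

# Count carriage-return bytes in each copy.
committed_cr=$(git cat-file -p HEAD:notes.txt | tr -cd '\r' | wc -c | tr -d ' ')
worktree_cr=$(tr -cd '\r' < notes.txt | wc -c | tr -d ' ')
echo "CR bytes in commit: $committed_cr, in working tree: $worktree_cr"
```

If the committed count is not zero in a real repository, the CRLF endings were already in the index before the setting took effect, which is the situation git add --renormalize is meant to repair.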
[1] Actually re-compressing a file into the Git format is a lot of work, so Git cleverly avoids it whenever possible. That's one of the reasons that Git's index exists. Other version control systems get by without one; other version control systems are slower. The problem here is just that Git thinks that this work-avoidance is possible too often.
[2] Of course, if you modify .gitattributes, Git will realize that .gitattributes is modified. It's just that it never extends that to thinking: Oh hey! Maybe that means other files have EOL settings changed.
core.autocrlf vs .gitattributes
There are a bunch of differences between using core.autocrlf and using .gitattributes to specify in-repository line-ending formats. The biggest and most obvious is of course that core.autocrlf is a setting everyone has to make in their personal .git/config or $HOME/.gitconfig or wherever they like to put it, but .gitattributes is a committed file.
Being a committed file has a bunch of ramifications: in particular, you get the existing one when you check out some commit. There's a copy of that file in each commit (well, each commit made while .gitattributes was in Git's index at commit time), and when you check out that commit, Git obeys that commit's .gitattributes settings. When you git add files, Git obeys your working tree's .gitattributes settings, so you can change the settings and git add files, including .gitattributes itself, and your updated settings will apply and will go into the next commit.
Importantly, when listing files in .gitattributes, you are in control. You can tell Git that file xyz is a binary file, if it is binary. You can tell Git that file abc is a text file, if it is text. You can say that *.js or *.py are text and that *.jpg are binary. When you use core.autocrlf, you're just having Git guess. Git can guess wrong, and do CRLF changes to your binary files, or not do them to your text files.
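A small sketch of what such a file might look like, with git check-attr used to confirm how Git classifies a path. The patterns mirror the *.js, *.py and *.jpg examples above; the eol=lf choice and the paths src/app.js and img/logo.jpg are assumptions for illustration (git check-attr does not need the files to exist).

```shell
set -eu
# Throwaway repository; the paths queried below are hypothetical.
demo=$(mktemp -d)
cd "$demo"
git init -q .

# Declare explicitly what is text and what is binary, instead of letting Git guess.
cat > .gitattributes <<'EOF'
*.js   text eol=lf
*.py   text eol=lf
*.jpg  binary
EOF

# Ask Git which attributes apply to a given path.
js_eol=$(git check-attr eol -- src/app.js)
jpg_text=$(git check-attr text -- img/logo.jpg)
echo "$js_eol"
echo "$jpg_text"
```

Running git check-attr on a few representative paths is a cheap way to verify a new pattern before committing the .gitattributes change.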
For details on what to put in a .gitattributes file, see the gitattributes documentation.
When you use git merge to do a three-way merge,[3] there are three input commits: the current commit, the commit you select via the command line, and the merge base. You can normalize line endings in your current commit, by just making a new commit with the fixed line endings. If you really had to, you could check out the commit you are about to select on the command line, normalize its line endings, and commit that too, and use that one on the command line instead. But you literally can't fix the merge-base commit: Git finds this one on its own, based on the commit graph, that is, the linkage between commits as stored in the commits' metadata.
Line endings matter to git diff, and the merge depends on the two diffs (from the merge base to your commit, and from the merge base to the commit you select). So it is sometimes necessary to fix the merge base, as well as the current commit and the other commit. The renormalize option does exactly that, on the fly: it normalizes all three versions of each file during the merge itself. This way, there's no need to do the impossible: to fix historical commits' line endings. A merge that requires renormalizing goes a little slower, but that's better than not going at all.
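The scenario can be sketched in a throwaway repository. One branch normalizes a CRLF file to LF, another branch (named feature here, a made-up name, as are file.txt and the demo identity) edits one line while still using CRLF, and the two are merged with -X renormalize. This is an illustration under those assumptions, not a recipe for any particular history.

```shell
set -eu
# Throwaway repository; branch and file names are made up for the demo.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"
git config core.autocrlf false

# Merge base: a file committed with CRLF line endings.
printf 'one\r\ntwo\r\nthree\r\n' > file.txt
git add file.txt
git commit -q -m "base with CRLF"
main_branch=$(git symbolic-ref --short HEAD)

# Side branch: edit the third line, still using CRLF.
git checkout -q -b feature
printf 'one\r\ntwo\r\nTHREE\r\n' > file.txt
git commit -q -am "edit third line"

# Original branch: switch to LF-in-repository via .gitattributes.
git checkout -q "$main_branch"
printf '*.txt text eol=lf\n' > .gitattributes
git add .gitattributes
git add --renormalize .
git commit -q -m "normalize line endings"

# Without renormalize, every line differs between base and current commit.
# With it, all three versions are normalized before they are compared.
git merge -q -X renormalize -m "merge feature" feature

third_line=$(git cat-file -p HEAD:file.txt | sed -n 3p)
merged_cr=$(git cat-file -p HEAD:file.txt | tr -cd '\r' | wc -c | tr -d ' ')
echo "third line: $third_line, CR bytes in merge result: $merged_cr"
```

Setting merge.renormalize to true gives the same behavior without having to remember the -X option on each merge.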
[3] Remember that git merge might instead do a fast-forward merge, which is not a merge at all.