How to fix changed source lines showing ^M with "git diff" but look fine with vim/gedit/cat -e?


I have searched high and low for anyone asking a similar question. It does not seem to be a simple case of :set fileformat=dos or :set fileformat=unix.

Writing the file out with :set fileencoding=latin1 and :set fileformat=dos changed it such that git diff reports every line as having ^M appended.

The code was originally happily existing as:

...
if (v == value32S)
{
...

I made the following outrageously radical improvement (which looks fine on the screen in vim):

...
if (v == value32S ||
    v == value33)
{
...

But git diff to check for erroneous changes shows:

diff --git a/csettings.cpp b/csettings.cpp
index 1234..8901 100755
--- a/csettings.cpp
+++ b/csettings.cpp
@@ -2466,7 +2466,8 @@ bool MyClass::settingIsValid(QString s)

#if CONFIG_1 || CONFIG_2

-       if (v == value32S)
+       if (v == value32S ||^M
+           v == value33)^M
        {
              doSomething(new_v);

where each ^M is displayed in reverse video.

I have tried several means to make the apparently spurious carriage returns go away. First was to be sure there wasn't a hidden character. View with vim :set list:

...
if (v == value32S ||$
    v == value33)$
{$
...

Seems fine. Dumping the file (microdetails vary to protect NDA, and I am too lazy to make it a perfect deception):

$ hd csettings.cpp
(...)
0000eae0 xx xx xx xx xx xx xx xx  xx 65 33 32 53 20 7c 7c  |(v == value32S |||
0000eaf0 0d 0a 20 20 20 20 20 20  20 20 20 20 20 20 76 20  |..             v |
0000eb00 3d 3a 20 xx xx xx xx xx  xx 65 33 33 29 0d 0a 20  |== ...value33).. |

All of the other lines also end in "0d 0a", so this looks fine. An interesting suggestion was to use cat -e (which was new to me):
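
Rather than reading the hex dump by eye, the "every line ends in 0d 0a" check can be scripted. The following is only a sketch: the temporary file is a hypothetical stand-in for csettings.cpp, and the variable names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample standing in for csettings.cpp: three CRLF-terminated lines.
sample=$(mktemp)
printf 'if (v == value32S ||\r\n    v == value33)\r\n{\r\n' > "$sample"

cr=$(printf '\r')                        # a literal carriage return
total=$(wc -l < "$sample" | tr -d ' ')   # number of lines in the file
crlf=$(grep -c "${cr}\$" "$sample")      # lines whose last character is a CR

# If the two numbers match, every line ends in CRLF.
echo "total=$total crlf=$crlf"
rm -f "$sample"
```

Pointing the same two commands at a real file shows whether its line endings are uniformly CRLF or mixed.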

$ cat -e c.cpp

...
if (v == value32S ||^M$
    v == value33)^M$
{^M$
...

Another suggestion was to use file for clues:

$ file csettings.cpp
csettings.cpp:  C source, UTF-8 Unicode text, with CRLF line terminators

Interestingly, this is the only file in this directory (of header files and cpp code) which isn't ASCII text. Some files have CRLF line terminators and some do not. Also, some show C++ source and others are C source which I assume isn't significant.

Deleting the file and running git checkout to get a fresh copy still shows it as UTF-8, which I traced to the degree symbol in some strings ("°F" and "°C"), so UTF-8 doesn't seem to be the issue.

Still, I don't see why using vim to edit only these lines would cause this problem. Or maybe it isn't a problem? Any ideas?

----- Addendum -----

git config --get-regexp core.* shows

core.repositoryformatversion 0
core.filemode true
core.bare false
core.logallrefupdates true

Solution

  • By default, Git assumes that you're using Unix line endings in the repository and highlights carriage returns as trailing whitespace. However, by default, it highlights trailing whitespace only on added lines (those prefixed with +), since the goal is to avoid introducing new problems. That is why ^M appears only on the lines you changed, even though every line in the file ends in CRLF.

    If you run git diff --ws-error-highlight=all, you'll see that there are also carriage returns on the lines being removed and on the context lines. If you don't want to see this, you can set core.whitespace to cr-at-eol, which stops Git from highlighting them. There are no ill effects to this; it simply prevents a carriage return at the end of a line from being treated as trailing whitespace.

    If you're planning on using this project on non-Windows systems, you should convert the line endings to Unix and use a .gitattributes file to specify the text attribute for text files so the line ending is automatically converted based on the operating system in use. This may be valuable even if your project is only used on Windows, since if someone has core.autocrlf set, you may end up with mixed line endings.
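
The suggestions above can be tried safely in a throwaway repository before touching real code. This is a sketch under stated assumptions, not a recipe for any particular project: the file name demo.cpp, the identity settings, and the *.cpp / *.h patterns in .gitattributes are all hypothetical, and git add --renormalize needs a reasonably recent Git (2.16 or later).

```shell
#!/bin/sh
set -e

# Throwaway repository; every name here is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "you@example.com"
git config user.name "Example"
git config core.autocrlf false    # keep the CRLF bytes exactly as written

# Commit a CRLF file, then edit it the way the question does.
printf 'if (v == value32S)\r\n{\r\n' > demo.cpp
git add demo.cpp
git commit -q -m "Add CRLF file"
printf 'if (v == value32S ||\r\n    v == value33)\r\n{\r\n' > demo.cpp

# 1. Highlight whitespace errors on removed and context lines too,
#    not only on added lines.
git --no-pager diff --color=always --ws-error-highlight=all

# 2. Count whitespace complaints before and after telling Git that a
#    CR at the end of a line is acceptable.
before=$(git diff --check | grep -c 'trailing whitespace' || true)
git config core.whitespace cr-at-eol
after=$(git diff --check | grep -c 'trailing whitespace' || true)
echo "whitespace warnings: before=$before after=$after"

# 3. Mark sources as text and renormalize so the repository stores LF.
printf '*.cpp text\n*.h text\n' > .gitattributes
git add --renormalize .
git add .gitattributes
git commit -q -m "Normalize line endings"

# Count lines in the committed blob that still contain a CR.
cr=$(printf '\r')
crs=$(git show HEAD:demo.cpp | grep -c "$cr" || true)
echo "lines with CR in committed blob: $crs"
```

Step 2 only changes what Git reports; the file's bytes are untouched. Step 3 rewrites what is stored in the repository, so on a shared project it is worth doing in its own commit.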