
Change End Of Line git configuration


I'm currently working on a project with a colleague. He uses a Mac and I use Windows, so we are having trouble with the EOL (End Of Line) characters in our files. We decided that we want all files to use LF. For him that's no problem, because he is on a Mac, but I had to convert all my files from CRLF to LF, which I have already done. I have also configured my setup so that every new file I add starts out as LF. The problem is that when I want to commit changes to our repo, it gives me this warning:

warning: LF will be replaced by CRLF in tsconfig.build.json.
The file will have its original line endings in your working directory

What do I want to achieve? I want to disable this "automatic" replacement by GitHub. I don't want my files to be converted to CRLF; I want all my files to be LF at all times. Any suggestions?


Solution

    First, some important terminology notes: GitHub never replaces anything. It literally can't. Don't worry about GitHub here; it is irrelevant. Do worry about your own Git, which (aside from starting with the same three letters) has very little to do with GitHub. GitHub is a hosting provider where you can store and access Git repositories. Git is software that implements a Version Control System.

    Your Git is where the problem is and/or should be solved. Well, your Git, and your friend's or colleague's Git, and so on, depending on what systems they use. Your file editors may also play roles, but Git can and will override them, if you set it up to do so.

    You wrote: "I want all my files to be LF at all times."

    You mention Visual Studio Code, which might also have its own method of messing about with your files. You can investigate this and configure it if it suits your purposes. This answer only concerns Git itself.

    What to set in .gitattributes

    To make sure that newly added or updated files get and see LF-only line endings, modify your existing .gitattributes files to list the file names or patterns you want affected and include:

    <pattern> text eol=lf
    

    The pattern part here can include things like *.json or *.sh or even just * (which matches all file names).

    If you don't have a .gitattributes file, just create one. Make sure it contains plain text, preferably simple ASCII or UTF-8 without a byte order mark (BOM).

    What this * text eol=lf line means

    As noted above, the first part is a pattern: it's the set of files that the rest of the line applies to. You can list more than one pattern, or list the same pattern more than once, by including multiple lines. The last matching pattern will generally override earlier ones.[1]

    The text part tells Git that this file is text, i.e., editable content made up of lines, as opposed to a binary file, which Git can't assume contains text. This enables end-of-line conversion on the file.[2]

    The eol=lf tells Git what the ends of lines should look like. The lf here means use line-feed endings, as opposed to the CRLF line endings that Windows programs often seem to prefer or require.

    The other value you can set for eol is eol=crlf, but given your statement above, you don't want that.


    [1] This is how it works in .gitignore as well. But, in .gitattributes files, this can get complicated, since each line can set different things. For instance, you can write:

    * text
    *.bin -text
    

    The -text overrides the text, so the last line takes precedence, but only for files whose name ends with .bin. But you could also write:

    * text zorg
    *.bin -text
    

    With this version, you've set the zorg attribute for all files, including *.bin files. The -text unsets text for *.bin, but leaves zorg set.

    [2] Technically, you can leave the text part out entirely if you use the eol=lf part. Setting eol to some value implies setting text. The gitattributes documentation uses text eol=lf in an example, though, so that seems to be de rigueur.
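
    The per-attribute override rule from footnote 1 can be modeled in a few lines of Python. This is only a sketch, not how Git itself is implemented: the resolve_attributes helper is hypothetical, it matches on the bare file name with simple glob patterns (real .gitattributes matching has more rules), and zorg is the same made-up attribute used above.

```python
from fnmatch import fnmatchcase


def resolve_attributes(rules, filename):
    """Return the effective attributes for filename.

    rules is a list of (pattern, [attribute, ...]) pairs in file order.
    Later matching lines override earlier ones, one attribute at a time.
    """
    effective = {}
    for pattern, attrs in rules:
        if not fnmatchcase(filename, pattern):
            continue
        for attr in attrs:
            if attr.startswith("-"):
                effective[attr[1:]] = False      # "-text" unsets text
            elif "=" in attr:
                name, value = attr.split("=", 1)
                effective[name] = value          # "eol=lf" sets a value
            else:
                effective[attr] = True           # "text" sets the attribute
    return effective


rules = [
    ("*", ["text", "zorg"]),
    ("*.bin", ["-text"]),
]
print(resolve_attributes(rules, "image.bin"))  # {'text': False, 'zorg': True}
print(resolve_attributes(rules, "notes.txt"))  # {'text': True, 'zorg': True}
```

    The second rule only touches text, so zorg stays set for image.bin, which is the behavior the footnote describes.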


    Don't use core.autocrlf

    Older Git setups used core.autocrlf, core.eol, and other similar core settings to do this sort of thing. Git still accepts these settings, but they live in each person's own configuration rather than in the repository, so they are easy to get out of sync between collaborators. Along with this, you can (but should not) use text=auto in a .gitattributes line. This tells Git to guess whether the file contains text, and therefore has lines whose line endings can be changed, or is binary (and therefore should not have its line endings messed with, as the bytes merely coincidentally resemble line endings, but are actually precious binary data).

    The warning you're seeing comes from this machinery. If Git is set up to convert a file's line endings, and the copy in your work-tree does not already have the endings Git would write when extracting it, Git warns you that the file will look different the next time it is extracted.

    The underlying mechanism

    There may come a time—or maybe it has already come—where you want to see what's in the repository itself, rather than the files you work with. You might wonder what this statement means. After all, isn't the repository the collection of the files you work with? But the answer is: no, it's not!

    Git is all about commits. What a Git repository contains and uses is, at least for the most part, commits. While a commit contains files, a commit is not, itself, a file, nor is it the files it contains. A commit is its own thing. It's something real, and it's a unit—well, mostly the unit—that your Git passes around to other Gits, and vice versa.

    Every commit has a unique number. That's not a unique number within your repository, but rather a globally or universally unique number, much like a UUID or GUID. To make sure each ID is truly unique to each particular commit, Git assigns very long and random-looking hash IDs to the commits. These are, in effect, the true names of each commit. Your Git gets together with some other Git, and the two share commits by sharing these IDs. If your Git has some ID they don't, that means your Git has some commit they don't, and vice versa.

    This ID is in fact a cryptographic checksum of all the data that go into the commit. For this reason, no part of any commit can ever change, once the commit is made. This is why you need not worry about GitHub. You make the commits on your computer, and from then on, nothing—not even Git itself—can change them. They are completely, totally read-only. Any line endings inside any file inside any commit are that way forever—along with the rest of that line, and all the other lines, in that file. The files inside each commit are all frozen for all time.
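
    To see why a line ending is "frozen" once stored, it helps to compute an object ID by hand. The sketch below hashes file content the way Git hashes a blob (the object type that holds a file's bytes), assuming the default SHA-1 object format. Commits are hashed the same way but with a different header and body; a blob is used here only because it is shorter to show. The git_blob_id name and the sample JSON content are made up for illustration.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the ID Git would give a blob holding exactly these bytes."""
    # Git hashes a small header ("blob <size>" plus a NUL byte) followed by the content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


lf_version = b'{"compilerOptions": {}}\n'
crlf_version = b'{"compilerOptions": {}}\r\n'

# One extra carriage return yields a completely different ID.
print(git_blob_id(lf_version))
print(git_blob_id(crlf_version))
print(git_blob_id(lf_version) == git_blob_id(crlf_version))  # False

# The empty blob has a well-known ID.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

    Because the ID is derived from the bytes, changing a stored line ending would change the ID, which means it would be a different object rather than a modified one.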

    So, a commit contains files—that's its main data, really—along with some other metadata that we won't go into in this answer. The files in the commit are frozen for all time, along with the metadata. They're also stored in a de-duplicated format that only Git itself can read. These two factors make the files quite useless for getting any real work done, because:

    • we need to be able to read the files (in programs, in our editors, and so on), and
    • we need to be able to change the files, to do new work.

    What this means is that to use the files from some commit, Git must extract the files. When it does so, it puts the extracted files in a work area. Git calls this work area your working tree or work-tree. This is quite simple: it's where you do your work.

    The files you work on, in your working tree, are not Git's files. The committed files are inside some commit. Those can't be changed. Git's files are elsewhere. The work-tree files are yours, and are not inside Git at all.

    Because Git has to extract files before you can see them and use them, this is an ideal place to take files that have LF-only line endings, and turn them into CRLF-ended files for you to see and work on/with. If you choose to have Git mess with line endings, Git will try to store files with LF-only line endings at all times, and convert them to CRLF endings during the extraction process.

    Because Git has to compress files into its frozen format before it can put any new or changed file into a commit, this is an ideal place to take files that have CRLF line endings, and turn them into files that have LF-only line endings, before storing them inside a new commit. If you choose to have Git mess with line endings, Git will always convert the lines to LF-only lines when replacing a file's content with new content.

    The mechanism for this works on the whole file. When you use git add (which you must do if you've updated some file[3]), Git will, at that time, do the line ending conversion and compress the file into the frozen format, ready to be committed.[4] Similarly, when you use git checkout to switch from one commit to another, Git has to remove from your work-tree any file that is different (or gone entirely!) in the new commit you're switching to, and replace it with the file taken out of that commit. It then expands out the committed file into a usable one in your work-tree, and while doing so, can replace LF-only line endings with CRLF line endings.
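
    The two conversion points described above can be sketched as a pair of functions. This is a deliberately simplified model, not Git's actual code: the function names are hypothetical, the eol argument stands in for the eol= attribute, and details such as lone carriage returns and binary detection are ignored.

```python
def to_repository(worktree_bytes: bytes) -> bytes:
    """Model of the "git add" side: store LF-only line endings."""
    return worktree_bytes.replace(b"\r\n", b"\n")


def to_worktree(stored_bytes: bytes, eol: str) -> bytes:
    """Model of the checkout side: write the endings that eol= asks for."""
    if eol == "crlf":
        lf_only = stored_bytes.replace(b"\r\n", b"\n")
        return lf_only.replace(b"\n", b"\r\n")
    return stored_bytes  # eol=lf: extract the stored bytes as they are


def survives_round_trip(worktree_bytes: bytes, eol: str) -> bool:
    """Would adding and then re-extracting give back the same bytes?"""
    return to_worktree(to_repository(worktree_bytes), eol) == worktree_bytes


edited = b"line one\r\nline two\r\n"
stored = to_repository(edited)
print(stored)                                   # b'line one\nline two\n'
print(to_worktree(stored, "crlf") == edited)    # True

lf_file = b"line one\nline two\n"
print(survives_round_trip(lf_file, "crlf"))     # False
print(survives_round_trip(lf_file, "lf"))       # True
```

    The case that prints False is the situation from the question: an LF file in the work-tree, with Git configured to write CRLF on extraction. With eol=lf the same file round-trips unchanged.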

    The especially-tricky part of this is, as noted in footnote 4, that Git works hard not to bother with files that didn't change. This all assumes that whatever eol= setting might have applied before, still applies now. So, sometimes, when you change the eol= setting, you have to erase Git's index to invalidate it, or touch all your files in your work-tree, or use git add --renormalize, if your Git is new enough to have the "renormalize" option.

    What this boils down to in practice is that if you change the eol= setting, you may want to run git add --renormalize --all or similar. If your Git is too old to have this option, there are some fairly ugly workarounds, but the best thing to do is probably to upgrade your Git version.


    [3] While you can use git commit -a or git commit --include and a list of file names, the way this works internally is, more or less, to run git add on those files. There's a lot to the more or less part but this answer won't go into those details.

    [4] The mechanism for this involves what Git calls, variously, the index, or the staging area, or (rarely these days) the cache. These three terms all refer to the same thing. The cache aspect of this thing tries to keep track of which files you might have actually changed in your work-tree, and which ones you definitely didn't change in your work-tree. This lets Git leave these files alone, as you change from commit to commit, speeding Git up. It also lets git add skip adding some files, speeding Git up.
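
    The cache aspect in footnote 4 explains why a changed eol= setting may appear to do nothing. The sketch below is a rough model only: Git's index records more file-system details than this, and the snapshot and looks_unchanged helpers are hypothetical names. The idea is that a file whose recorded size and modification time still match is assumed unchanged and skipped.

```python
import os
import tempfile


def snapshot(path):
    """Record cheap file-system facts, standing in for an index entry."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def looks_unchanged(path, cached):
    """True if the file still matches what was recorded earlier."""
    return snapshot(path) == cached


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "wb") as f:
        f.write(b"one\r\n")
    cached = snapshot(path)

    # Nothing touched the file, so it would be skipped.
    print(looks_unchanged(path, cached))  # True

    # Rewriting the file with different content changes its size.
    with open(path, "wb") as f:
        f.write(b"one\r\ntwo\r\n")
    print(looks_unchanged(path, cached))  # False
```

    Editing .gitattributes does not alter the size or timestamp of any other file, so under this model those files still look unchanged and are never re-converted. That is the gap git add --renormalize is meant to close.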