I've recently been reading up a bit on .gitattributes and also found places like this one, https://github.com/alexkaratarakis/gitattributes, where they try to maintain gitattributes for all file types. However, looking through those files, my instinct is that this is an unmaintainable mess. It means you'd have to update that file any time you use a new file extension, or any time some software introduces a new file extension, which is just impossible. When you're working with a team of 30+ people, a file like that is a nightmare to maintain; we can barely maintain a simple icons.svg file.
Along with that, I have been coding and using git for many years, on many different projects, and I've never used .gitattributes. We use things like prettier on our project, which rewrites newlines to "lf", and we have devs on windows; this has never caused any issues, and vscode has never had a problem with it either. Git also automatically detects binary files like pngs and automatically shows text diffs for files like svg. I've never had to configure that.
So I ask the question, is it really necessary to have this file? Because it seems to me like it's signing up for a ton of maintenance that's completely unnecessary and that git is smart enough to figure out what it should or shouldn't do with a file.
is it really necessary to have this file?
Yes, for any Git-related setting (eol, diff, merge filters, content filters, ...) that you want every collaborator on the repository to follow.
This differs from git config, which for security reasons remains local (it can include sensitive information or dangerous directives).
A .gitattributes file is part of your versioned source code, and contributes to establishing a common Git standard.
For instance, I always put (as in VonC/gitcred/.gitattributes):
*.bat text eol=crlf
*.go text eol=lf
Because no matter how your IDE/editor is configured, I need CRLF for my Windows bat scripts to run properly, and I prefer LF for Go files, which I edit on Windows or Linux. I have always considered local settings like core.autocrlf an antipattern, best left at false.
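To see what those two rules mean to Git, here is a minimal sketch that asks Git directly with git check-attr. It assumes a throwaway repository created with mktemp; the file names build.bat and main.go are hypothetical, and the paths do not even need to exist for the query to work.

```shell
# Throwaway repository; nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# The two rules quoted above.
printf '%s\n' '*.bat text eol=crlf' '*.go text eol=lf' > .gitattributes

# Ask Git which eol attribute applies to each (hypothetical) path.
bat_eol=$(git check-attr eol -- build.bat)
go_eol=$(git check-attr eol -- main.go)

echo "$bat_eol"   # build.bat: eol: crlf
echo "$go_eol"    # main.go: eol: lf
```

The answer is the same on every clone, whatever core.autocrlf or the editor says locally, which is the point of versioning the file.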
But a .gitattributes can declare many other Git elements:
- working-tree-encoding, used for translation files
- ident, to embed the file's blob SHA1 (the $Id$ keyword), as in here
- filter, most notably used by Git LFS, as in here, and I used it many times before
- diff, at least to avoid diffing binary files, or to define an external diff driver
- hunk-header patterns (xfuncname for instance): I mention them here
- merge, to set a custom merge driver (unityyamlmerge for instance)
- whitespace, to define what diff and apply should consider whitespace errors
The .gitattributes file is not "mandatory", but a useful tool in the Git toolbox, one that can be shared safely in a project code base.
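As a rough illustration of a few of the attributes listed above, the sketch below writes a small .gitattributes and queries it. The patterns and file names (logo.png, mockup.psd, main.c) are hypothetical examples, not taken from the linked repository, and the lfs filter only does anything if Git LFS is installed.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .

cat > .gitattributes <<'EOF'
# Treat PNG files as binary: no text conversion, no textual diff or merge.
*.png  binary
# Route large assets through Git LFS (effective only if git-lfs is installed).
*.psd  filter=lfs diff=lfs merge=lfs -text
# Flag trailing whitespace and space-before-tab in C sources.
*.c    whitespace=trailing-space,space-before-tab
EOF

# Query individual attributes for paths that need not exist.
png_diff=$(git check-attr diff -- logo.png)
psd_filter=$(git check-attr filter -- mockup.psd)
c_ws=$(git check-attr whitespace -- main.c)

echo "$png_diff"     # logo.png: diff: unset
echo "$psd_filter"   # mockup.psd: filter: lfs
echo "$c_ws"         # main.c: whitespace: trailing-space,space-before-tab
```

A handful of targeted rules like these is usually enough; there is no need to enumerate every extension, since Git's built-in detection still applies to paths that match nothing.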
And you can read it even in bare repositories:
With Git 2.43 (Q4 2023), the attribute subsystem learned to honor the attr.tree configuration, which specifies the tree to read the .gitattributes files from.
See commit 9f9c40c, commit 2386535 (13 Oct 2023) by John Cai (john-cai).
(Merged by Junio C Hamano -- gitster -- in commit 26dd307, 30 Oct 2023)
attr: read attributes from HEAD when bare repo
Signed-off-by: John Cai
The motivation for 44451a2 (attr: teach "--attr-source=<tree>" global option to "git", 2023-05-06, Git v2.41.0-rc1 -- merge) was to make it possible to use gitattributes with bare repositories.
To make it easier to read gitattributes in bare repositories however, let's just make HEAD:.gitattributes the default.
This is in line with how mailmap works, 8c473ce ("mailmap: default mailmap.blob in bare repositories", 2012-12-13, Git v1.8.2-rc0 -- merge).
And, still with Git 2.43 (Q4 2023):
See commit 9f9c40c, commit 2386535 (13 Oct 2023) by John Cai (john-cai).
(Merged by Junio C Hamano -- gitster -- in commit 26dd307, 30 Oct 2023)
attr: add attr.tree for setting the treeish to read attributes from
Signed-off-by: John Cai
44451a2 (attr: teach "--attr-source=<tree>" global option to "git", 2023-05-06, Git v2.41.0-rc1 -- merge) provided the ability to pass in a treeish as the attr source.
In the context of serving Git repositories as bare repos like we do at GitLab however, it would be easier to point --attr-source to HEAD for all commands by setting it once.
Add a new config attr.tree that allows this.
git config now includes in its man page:
attr.tree
A reference to a tree in the repository from which to read attributes, instead of the .gitattributes file in the working tree.
In a bare repository, this defaults to HEAD:.gitattributes.
If the value does not resolve to a valid tree object, an empty tree is used instead.
When the GIT_ATTR_SOURCE environment variable or --attr-source command line option are used, this configuration variable has no effect.
However, Git 2.46 (Q3 2024), batch 3 notes:
Git 2.43 started using the tree of HEAD as the source of attributes in a bare repository, which has severe performance implications.
For now, revert the change, without ripping out a more explicit support for the attr.tree configuration variable.
See commit 51441e6 (03 May 2024) by Junio C Hamano (gitster).
(Merged by Junio C Hamano -- gitster -- in commit b077cf2, 13 May 2024)
51441e6460: stop using HEAD for attributes in bare repository by default
With 2386535 ("attr: read attributes from HEAD when bare repo", 2023-10-13, Git v2.43.0-rc0 -- merge listed in batch #22), we started to use the HEAD tree as the default attribute source in a bare repository.
One argument for such a behaviour is that it would make things like "git archive" run in bare and non-bare repositories for the same commit consistent.
This change was merged to Git 2.43 but without an explicit mention in its release notes.
It turns out that this change destroys performance of shallowly cloning from a bare repository.
As the "server" installations are expected to be mostly bare, and "git pack-objects", which is the core of driving the other side of "git clone" and "git fetch", wants to see if a path is set not to delta with blobs from other paths via the attribute system, the change forces the server side to traverse the tree of the HEAD commit needlessly to find if each and every path of the objects it sends out has the attribute that controls the deltification.
Given that (1) most projects do not configure such an attribute, and (2) it is dubious for the server side to honor such an end-user supplied attribute anyway, this was a poor choice of the default.
To mitigate the current situation, let's revert the change that uses the tree of HEAD in a bare repository by default as the attribute source.
This will help most people who have been happy with the behaviour of Git 2.42 and before.
Two things to note:
- If you are stuck with a version of Git 2.43 or newer that is older than the release this fix appears in, you can explicitly set the attr.tree configuration variable to point at an empty tree object, i.e.
  $ git config attr.tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
- If you like the behaviour we are reverting, you can explicitly set the attr.tree configuration variable to HEAD, i.e.
  $ git config attr.tree HEAD
The right fix for this is to optimize the code paths that allow accesses to attributes in tree objects, but that is a much more involved change and is left as a longer-term project, outside the scope of this "first step" fix.
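The long hexadecimal value in the first workaround is not specific to any repository: it is the object ID of the empty tree. A quick way to check that, assuming the default SHA-1 object format (a SHA-256 setup would yield a different ID):

```shell
# Hash an empty input as a tree object, without writing it to any repository.
empty_tree=$(git hash-object -t tree /dev/null)
echo "$empty_tree"   # 4b825dc642cb6eb9a060e54bf8d69288fbee4904
```

Pointing attr.tree at that ID therefore means "read attributes from a tree that contains no .gitattributes at all".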