Our repo (https://github.com/code-dot-org/code-dot-org) has recently converted to using Git LFS. We have large static web directories in Git, which admittedly isn't a great pattern. Our approach has been to include these large directories in LFS, and then exclude certain "known to be text" file types within them.
Here is an example of this approach in .gitattributes:
pegasus/sites.v3/** filter=lfs diff=lfs merge=lfs -text # include dir in LFS
pegasus/sites.v3/**/*.haml !text -filter -merge -diff # exclude *.haml from LFS
However, git diff isn't behaving as expected with the exclusions. Say I modify pegasus/sites.v3/hourofcode.com/views/front_join_us.haml. This file is included by the directory pattern, then excluded by the more specific file-extension pattern. However, when I run git diff, it claims the file is binary and won't show a diff:
diff --git a/pegasus/sites.v3/hourofcode.com/views/front_join_us.haml b/pegasus/sites.v3/hourofcode.com/views/front_join_us.haml
index aaf209c98be..2de745be618 100644
Binary files a/pegasus/sites.v3/hourofcode.com/views/front_join_us.haml and b/pegasus/sites.v3/hourofcode.com/views/front_join_us.haml differ
Additionally, it won't auto-merge the files. What's mysterious is that some tools, like VSCode, can properly diff this file.
I've confirmed that front_join_us.haml is NOT tracked by LFS:
git lfs ls-files | grep front_join_us.haml # returns nothing
UPDATE: @LeGEC suggested using git check-attr -a -- <path> to check the attributes on the file. Results were:
> git check-attr -a -- pegasus/sites.v3/hourofcode.com/views/front_join_us.haml
pegasus/sites.v3/hourofcode.com/views/front_join_us.haml: diff: unset
pegasus/sites.v3/hourofcode.com/views/front_join_us.haml: merge: unset
pegasus/sites.v3/hourofcode.com/views/front_join_us.haml: filter: unset
It appears the "exclude from LFS" gitattributes patterns generated by git lfs migrate import do not actually return the file to its default attributes. How do I get filter, diff and merge back to their default values (so that git check-attr reports no attributes), rather than unset as they are now?
You mentioned in your comment that you suspected an issue with the attributes applied to your files. To debug attribute issues, you can use git check-attr:
git check-attr --all -- path/to/file
It turns out that setting the attributes this way:
# .gitattributes:
pegasus/sites.v3/**/*.haml !text -filter -merge -diff # exclude *.haml from LFS
leaves all of these attributes "unset":
$ git check-attr -a -- pegasus/sites.v3/hourofcode.com/views/front_join_us.haml
pegasus/sites.v3/hourofcode.com/views/front_join_us.haml: diff: unset
pegasus/sites.v3/hourofcode.com/views/front_join_us.haml: merge: unset
pegasus/sites.v3/hourofcode.com/views/front_join_us.haml: filter: unset
which is not the same as having them "unspecified". Quoting git help gitattributes:
Each attribute can be in one of these states for a given path:
[...]
Unset
The path has the attribute with special value "false"; this is specified by listing the name of the attribute prefixed with a dash - in the attribute list.
[...]
Unspecified
No pattern matches the path, and nothing says if the path has or does not have the attribute, the attribute for the path is said to be Unspecified.
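Having the diff attribute "unset" is what causes the missing diff: git treats such a path as binary and only reports that the files differ. Likewise, an unset merge attribute makes git keep the current branch's version and declare a conflict, which is why the files would not auto-merge.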
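To see the effect of an unset diff attribute in isolation, here is a minimal sketch using a throwaway repository. The file name page.haml, the commit identity and the temporary directory are made up for illustration. It does not need Git LFS installed.

```shell
#!/bin/sh
# Sketch: an unset diff attribute makes git report a plain text file as binary.
# All names here (page.haml, demo identity) are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit a text file together with a rule that unsets diff for it.
printf 'first line\n' > page.haml
printf '*.haml -diff\n' > .gitattributes
git add .
git commit -q -m "initial"

# Modify the file and diff it while diff is "unset".
printf 'second line\n' >> page.haml
unset_output=$(git diff -- page.haml)
echo "$unset_output"

# Rewrite the rule with '!' so diff is "unspecified", then diff again.
printf '*.haml !diff\n' > .gitattributes
unspecified_output=$(git diff -- page.haml)
echo "$unspecified_output"
```

The first git diff should only report that the files differ as binary, while the second should show an ordinary text hunk containing the added line.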
After changing the gitattributes rule to:
# .gitattributes: use '!' instead of '-'
pegasus/sites.v3/**/*.haml !text !filter !merge !diff # exclude *.haml from LFS
the attributes return to "unspecified" and the diff returns.
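If you want to check both variants side by side before touching the real repository, the following sketch reproduces the two rule sets in a scratch repository and prints what git check-attr reports for each. The site/ directory and page.haml are stand-ins for the real paths, and git-lfs does not need to be installed because check-attr only reads .gitattributes.

```shell
#!/bin/sh
# Sketch: compare '-' (unset) and '!' (unspecified) exclusion rules with check-attr.
# Directory and file names are hypothetical stand-ins for pegasus/sites.v3/...
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir -p site/views
printf 'some text\n' > site/views/page.haml

# Variant 1: exclusion written with '-', as in the question.
printf '%s\n' \
  'site/** filter=lfs diff=lfs merge=lfs -text' \
  'site/**/*.haml !text -filter -merge -diff' > .gitattributes
dash_attrs=$(git check-attr -a -- site/views/page.haml)
echo "with '-':"
echo "$dash_attrs"

# Variant 2: exclusion written with '!', as suggested in this answer.
printf '%s\n' \
  'site/** filter=lfs diff=lfs merge=lfs -text' \
  'site/**/*.haml !text !filter !merge !diff' > .gitattributes
bang_attrs=$(git check-attr -a -- site/views/page.haml)
echo "with '!':"
echo "$bang_attrs"
```

With the '-' rules, diff, merge and filter should be listed as unset. With the '!' rules, none of them should be listed at all, unless attributes come from somewhere else such as a global attributes file.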