I've found several purported solutions to this problem here on SO, but for some unknown reason none of them work for me.
I need to ignore everything in a given folder except for one particular file. Easy, right? Not so fast.
I've tried most every suggested answer for each of these questions:
...but I'm no further along than when I started.
Here's the path to the file to include:
D:\Projects\Website\Website\bin\Settings.json
The repo is at:
D:\Projects\Website
My .gitignore file was generated by Visual Studio, so it contains this entry:
[Bb]in/
According to many of the answers to the questions above, I should be able to do something like this:
!/Website/[Bb]in/Settings.json
...but that doesn't work. The file is still ignored.
None of these permutations do the trick:
!*/Settings.json
!**/Settings.json
![Bb]in/Settings.json
![Bb]in/**/Settings.json
![Ww]ebsite/[Bb]in/Settings.json
!Website/bin/Settings.json
!/Website/bin/Settings.json
I've also tried putting a separate .gitignore file in bin:
# Don't block Settings.json
!Settings.json
!.gitignore
No luck.
How can I block everything in [Bb]in except for the Settings.json file?
Expected result: Website\bin\Settings.json is not ignored
Actual result: Website\bin\Settings.json continues to be ignored
Adding on to LeGEC's answer, which is fine, I note that you commented:
That works. It strikes me as a bit brittle (maybe that's just my imagination, and hopefully I'll be proven spectacularly wrong), but if this is the only way, I can live with it.
It's not the only way, and I have that same itchy feeling about it being brittle or otherwise somehow subtly wrong. It does work and it won't break in normal everyday use, but it just seems wrong to me to have files that are, and stay, tracked solely because they are tracked in the commits you extract, as you go about making new commits.
The trick here is that the Git path name Website/bin/Settings.json results in a file that lives in a folder once extracted: the file Settings.json is in the folder bin (which in turn is in the folder Website, but that's just adding on to the pile; one "in-the-folder" layer is enough here).
Note that to Git, Website/bin/Settings.json is just a file name: that file name gets stored like that, with forward slashes, in Git's index (AKA staging area).[1] The problem occurs later, when Git is scanning your working tree. The exclusion handling that Git does (using .git/info/exclude and the various .gitignore files) works via working tree files. It has to: it is all about untracked files, and the very definition of untracked file is a file that exists in your working tree, but not in Git's index.
When Git is comparing the current (HEAD) commit's content (the set of stored files in the current commit, with all of their data) to the files in the index / staging-area, Git does not have to, and does not, look at your working tree at all. Everything Git needs is in the repository: the current commit is determined by reading HEAD, which resolves to a commit hash ID, which resolves to an internal tree object, which obtains for Git all the file names and modes and their hash IDs. The proposed next commit, in the index / staging-area, contains the file names and modes and their hash IDs. The hash IDs let Git know if files are 100% matches or not, and for most purposes that's all we care about: git status just prints an M for modified, or the word modified, without figuring out what actually changed, for instance.
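As a small illustration of why hash IDs are enough for a same-or-different check, git hash-object prints the ID Git would assign to a piece of content. The sample strings here are made up for the sketch:

```sh
# Identical content always gets the identical blob hash ID;
# any change in content gives a different one.
a=$(printf 'same\n' | git hash-object --stdin)
b=$(printf 'same\n' | git hash-object --stdin)
c=$(printf 'changed\n' | git hash-object --stdin)

[ "$a" = "$b" ] && echo "a and b: 100% match"
[ "$a" != "$c" ] && echo "a and c: modified"
```

Comparing two short strings like this tells Git whether a file differs without reading or diffing the file's data.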
Reading through the working tree, though: well, that's way harder. The OS gets in the way here. Sure, there may be a C library scandir or readdir function, or some other way to enumerate the contents of a folder. But Git still has to call lstat on each name, perhaps.[2] In any case, if you analyze timing results to see why git status took more than 20 nanoseconds, you find that it spends a lot of time just reading directories. Wouldn't it be nice if we could find some shortcut for this?
Enter .gitignore and other exclusion files: if we read the top level work-tree and find directories named tmp and zorg, but those directories are ignored (via * or */ or tmp or tmp/ or whatever), why, then, we don't even have to open and read them at all! It won't matter whether ./tmp contains one file, or one billion files: we'll skip the whole thing! Given that just opening and reading a directory to find its file names can take milliseconds, and using lstat on each name can add many more, this is a huge savings.
So, Git does this. If Git is preparing a working-tree walk, and it is allowed to skip looking inside some folder / directory, it does skip looking inside that folder. Hence, if your .gitignore file says:
*
then any directory name will match, and Git will skip opening, much less reading, the directory. This happens to your Website folder.
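The same shortcut can be seen outside Git. The following is only an analogy using find, not Git's own code, and the directory and file names are invented for the sketch:

```sh
# Build a throwaway tree: two "ignored" directories and one normal one.
tree=$(mktemp -d)
mkdir -p "$tree/tmp" "$tree/zorg" "$tree/src"
touch "$tree/tmp/a" "$tree/tmp/b" "$tree/zorg/c" "$tree/src/main.c"

# -prune stops find from descending into tmp and zorg at all,
# so their contents are never enumerated, however large they are.
walked=$(cd "$tree" && find . \( -name tmp -o -name zorg \) -prune -o -type f -print)
echo "$walked"

rm -rf "$tree"
```

Only ./src/main.c is printed. The files under tmp and zorg are never visited, which is the effect an ignored directory has on Git's working-tree walk.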
If your .gitignore reads:
*
!Website
though, when Git reads the top level directory and finds the name Website, it can't ignore that. So Git opens the Website folder and finds bin, among other things. But bin does match * and does not match Website, so it's ignore-able. That means Git can skip right over it, never looking inside it. You'll need to add Website/bin:
*
!Website
!Website/bin
Now Git has to open Website/bin and read it. Every file and directory within it can be ignored, so to get Settings.json within it to be not-ignored, we need to list that file:
*
!Website
!Website/bin
!Website/bin/Settings.json
This fairly-minimal .gitignore file will work. It does, however, have one flaw. If there's a file or directory in bin named Website, that file or directory will be not-ignored. If not-ignored, Git will complain about it being untracked, or add it with git add ., or exhibit other undesirable behaviors. To fix that, we should make sure that only Website is matched, not, e.g., bin/Website. This gets us to the second tricky part of Git's exclusion rules.
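To watch the four entries take effect one at a time, a sketch along these lines may help. It assumes a throwaway repository created with mktemp, and it relies on git check-ignore -q, whose exit status is 0 when the path is ignored and 1 when it is not:

```sh
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
mkdir -p Website/bin
touch Website/bin/Settings.json

results=""
: > .gitignore
for rule in '*' '!Website' '!Website/bin' '!Website/bin/Settings.json'; do
    # Append one more entry, then ask Git about the file again.
    printf '%s\n' "$rule" >> .gitignore
    if git check-ignore -q Website/bin/Settings.json; then
        state=ignored
    else
        state=kept
    fi
    results="$results$state "
    echo "after adding $rule: $state"
done

cd / && rm -rf "$repo"
```

The file should stay ignored until the last entry is added, because each parent directory has to be un-ignored before Git will look inside it.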
[1] The format for index entries is a bit messy and gets compressed, depending on index format version (of which there are several), but git ls-files --stage will dump out the main stuff of interest, and there, you'll see the file named with embedded forward slashes. Git is, of course, capable of handling, and understanding, the backward slashes that Windows uses here, and hence stores the file in the bin folder in the Website directory.
Strings in Git's index are case-sensitive and are stored as UTF-8 or equivalent, regardless of how the file names are stored in the file system, and regardless of whether the file system's file names are case-insensitive.
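For a quick look at what the index holds, something like the following can be run in a scratch repository. The repository location and the file contents are placeholders for this sketch:

```sh
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
mkdir -p Website/bin
echo '{}' > Website/bin/Settings.json
git add Website/bin/Settings.json

# Each line shows mode, blob hash ID, stage number, then the path
# exactly as the index stores it, with forward slashes.
staged=$(git ls-files --stage)
echo "$staged"

cd / && rm -rf "$repo"
```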
[2] Some readdir variants include a type field, DT_DIR for instance, that (if you can rely on it) lets you skip this step sometimes; that can be a huge time-saver. I don't know if Git tries to do this: the working tree code has been revised multiple times, and now has all the complications from the fsmonitor code, which is a different way to speed things up, so I have not looked lately.
To understand this part properly, I like to borrow a concept from regular expressions: the idea of anchoring something to the left or right. In a regular expression like me*s, we'll match ms pacman and message, but not memory, because we're looking for m, then any number of e characters, then s, and memory has no s. But we'll also match acmestorage because that has m followed by one e followed by s, embedded within acme and storage (which run together). We can avoid some of this by anchoring the match at the left: ^me*s won't match acmestorage because the m has to be the first letter.
(REs also let us anchor at the right with $, typically. Each RE syntax has its own peculiarities, and .gitignore files use glob syntax rather than RE syntax, so let's not get too far down this rabbit hole. Just remember the idea of anchoring: sticking a match to the left or right, or both. In Git's case, an anchored path is an exact match, stuck at both sides. That's because the right side is always anchored. You'd have to use path/* or path/** to allow arbitrary right-hand-side parts.)
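The regular-expression half of the analogy is easy to try with grep -E, using the same four sample strings. The second pattern is the left-anchored form of the first:

```sh
words='ms pacman
message
memory
acmestorage'

# Unanchored: the match may start anywhere in the line.
unanchored=$(printf '%s\n' "$words" | grep -E 'me*s')
# Anchored at the left: the match must start at the first character.
anchored=$(printf '%s\n' "$words" | grep -E '^me*s')

echo "unanchored:"; echo "$unanchored"
echo "anchored:";   echo "$anchored"
```

The unanchored pattern selects ms pacman, message and acmestorage. The anchored one drops acmestorage and keeps the other two.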
In our case, with .gitignore, we'd like to make sure that Website only matches at the top level, where we put the .gitignore file. To do that, we can start the entry with a leading slash:
*
!/Website
!Website/bin
!Website/bin/Settings.json
Now bin/Website won't match the second line: the second line is anchored at the top (root) directory of the scan, and bin/Website is not at that level: it's one level down.
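One way to see the difference the leading slash makes is to compare both forms against a stray file that shares the name. This is a sketch in a scratch repository, and the stray Website/bin/Website file is invented for the test:

```sh
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
mkdir -p Website/bin
touch Website/bin/Website      # a stray file that happens to share the name

# Unanchored: !Website also matches the stray Website/bin/Website.
printf '%s\n' '*' '!Website' '!Website/bin' '!Website/bin/Settings.json' > .gitignore
git check-ignore -q Website/bin/Website; unanchored=$?

# Anchored: !/Website matches only the top-level Website.
printf '%s\n' '*' '!/Website' '!Website/bin' '!Website/bin/Settings.json' > .gitignore
git check-ignore -q Website/bin/Website; anchored=$?

echo "unanchored exit status: $unanchored (1 = not ignored)"
echo "anchored exit status:   $anchored (0 = ignored)"
cd / && rm -rf "$repo"
```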
You might think we should do that for all three file names:
*
!/Website
!/Website/bin
!/Website/bin/Settings.json
This works, but it's not necessary, and the reason is that a .gitignore entry is automatically anchored if it has an embedded slash in it. Website/bin has a slash in it that is not at either end, so it's automatically anchored. Website/bin/Settings.json has two such slashes and is also anchored.
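The automatic anchoring can be checked in isolation with plain (non-negated) entries. In this sketch the nested directory is a made-up second location for a Website/bin pair:

```sh
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
mkdir -p Website/bin nested/Website/bin
touch Website/bin/a.dll nested/Website/bin/b.dll

# The embedded slash anchors this entry to the directory holding .gitignore.
echo 'Website/bin' > .gitignore
git check-ignore -q Website/bin; top=$?
git check-ignore -q nested/Website/bin; nested=$?

# With no slash at all, the entry matches the name at any depth.
echo 'bin' > .gitignore
git check-ignore -q nested/Website/bin; floating=$?

echo "Website/bin entry: top-level=$top nested=$nested"
echo "bin entry:         nested=$floating"
cd / && rm -rf "$repo"
```

An exit status of 0 means ignored and 1 means not ignored, so the anchored entry should hit only the top-level pair while the bare bin entry hits both.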
I implied there were only two tricky parts here. I lied. 😀 There's one more way that exclusion files use slashes, which is unfortunately tricky, and that is that a final slash makes an entry match only a directory name. That is:
bin/
matches the bin directory but not a file named bin.
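A short check of the directory-only behavior, assuming a scratch repository in which a/bin is a directory and b/bin is a plain file (both names are invented for the sketch):

```sh
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
mkdir -p a/bin b
touch a/bin/tool.exe   # a/bin is a directory
touch b/bin            # b/bin is a plain file

echo 'bin/' > .gitignore
git check-ignore -q a/bin; dir_status=$?
git check-ignore -q b/bin; file_status=$?

echo "directory a/bin: $dir_status (0 = ignored)"
echo "file b/bin:      $file_status (1 = not ignored)"
cd / && rm -rf "$repo"
```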
This rule is independent of the remaining rules:
- A leading ! negates the whole thing, so that !/Website/ means don't ignore.
- A leading / (after any leading !) or any embedded slash that's not at the end means "anchored", so that !/Website/ is anchored.
- A trailing / means only when it is a directory, so !/Website/ only matches a directory.
The trailing slash doesn't count for anchoring purposes (and you should never use a double trailing slash) so if you want anchoring, be sure to include a leading or embedded slash.
Using all of these rules, we come up with:
*
!/Website
!Website/bin
!Website/bin/Settings.json
which is complete and correct (provided I have the right upper and lower case here: remember that Git will be case-sensitive, regardless of your file system). But there's one other trick we can use that gives us a slightly shorter file. Suppose we write:
*
!*/
!Website/bin/Settings.json
Git will:
- ignore everything (*);
- except for directories, which it will not ignore (!*/);
- find the Website directory, hence open and read it;
- for each file in Website/, ignore it (*);
- find bin and not ignore it (!*/);
- open and read the Website/bin directory;
- ignore everything in it (*) except for Website/bin/Settings.json.
The downside to this three-line version is that, during the above processing, Git will open and read every directory, including every subdirectory of every directory, so if there is a top-level tmp directory containing one billion files (directly or after recursing), Git will spend time checking every single one of them. That is, !*/ completely defeats the "don't bother looking here" optimization that saves so much time in some cases.
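The three-line version can be checked the same way. This sketch uses a scratch repository, and other.dll and tmp/deep/junk are placeholder names:

```sh
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
mkdir -p Website/bin tmp/deep
touch Website/bin/Settings.json Website/bin/other.dll tmp/deep/junk

printf '%s\n' '*' '!*/' '!Website/bin/Settings.json' > .gitignore

git check-ignore -q Website/bin/Settings.json; settings=$?
git check-ignore -q Website/bin/other.dll; other=$?
# Directories are no longer ignored, so Git has to walk into tmp/deep too.
git check-ignore -q tmp/deep; deep_dir=$?
git check-ignore -q tmp/deep/junk; junk=$?

echo "Settings.json=$settings other.dll=$other tmp/deep=$deep_dir junk=$junk"
cd / && rm -rf "$repo"
```

With 0 meaning ignored and 1 meaning not ignored, Settings.json and the tmp/deep directory should report 1 while the other two files report 0. The directory result is the cost described above: nothing under tmp is kept, yet Git still has to look through it.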
What would be nice is if Git's exclusion code were smart enough to realize that if you write:
*
!Website/bin/Settings.json
it should automatically register !/Website/ and !/Website/bin/ into its exclusion list if those aren't already present. This seems pretty straightforward to do. (Precisely how to do the negation and anchoring depends on the internal data structures here, which I have not looked at in more than ten years...)