Explain gitignore pattern matching


I have the following directory tree:

> #pwd is the repo   
> tree -a
.
├── .git
│   |.....
├── .gitignore
├── README.md
├── f1.html
├── f2.html ... and some more html
├── images
│   └── river.jpg
>

I also have the following in my .gitignore:

> cat .gitignore
*
!*.html
!images/*.*
>

I would like all files in the images directory to be included in the repo. But that is not happening. I got it to work using the following in gitignore:

*
!*.html
!images*
!*.jp*g

What is happening here? Is there a foolproof way to test a .gitignore? I checked the documentation. Here is the point I don't understand (this is under the "Pattern Format" heading):

Otherwise, Git treats the pattern as a shell glob suitable for consumption by fnmatch(3) with the FNM_PATHNAME flag: wildcards in the pattern will not match a / in the pathname. For example, "Documentation/*.html" matches "Documentation/git.html" but not "Documentation/ppc/ppc.html" or "tools/perf/Documentation/perf.html".


Solution

  • First, the tricky part of your question is the first line of the .gitignore file:

    # Exclude every file and directory in the repository,
    # unless a later ! pattern explicitly re-includes it.
    *
    

    Let's start with the first version of your .gitignore:

    1. * excludes every file and directory in the repository.
    2. !*.html re-includes all HTML files.
    3. !images/*.* is meant to re-include every file directly inside the images folder.

    To include JPG/JPEG files you could put !*.jp*g on the 3rd line, which makes Git consider .jpg and .jpeg files in any folder, provided the folder itself is not excluded (see the solution part). But you specifically wanted files from the images folder only, and of any type, not just JPEGs. Let's first read the relevant documentation; the third section gives the solution.


    How gitignore patterns treat folders:

    1. Pattern ending with a slash: If a pattern has the form <dir-name>/, Git ignores that directory and everything beneath it, including all sub-directories. As the example in the docs says:

      foo/ will match a directory foo and paths underneath it, but will not match a regular file or a symbolic link foo

      Also note that once a directory is excluded, a pattern that matches a file inside that directory has no effect; Git never looks at it.

    2. Pattern without a slash: If you list a directory name without a trailing slash, Git treats it as a plain glob pattern, so it can match any file or directory with that name.

      If the pattern does not contain a slash /, Git treats it as a shell glob pattern and checks for a match against the pathname relative to the location of the .gitignore file (relative to the toplevel of the work tree if not from a .gitignore file).

    3. Pattern with a slash and wildcards (* or ?): If the pattern looks like your first example, images/*.*, it works as the documentation describes:

      Example: "Documentation/*.html" matches "Documentation/git.html" but not "Documentation/ppc/ppc.html" or "tools/perf/Documentation/perf.html".
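To make the "wildcards will not match a /" rule concrete, here is a minimal Python sketch that imitates it. It is not Git's matcher: glob_to_regex and path_match are hypothetical helper names, only * and ? are handled, and bracket expressions, ** and escapes are left out.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob so that '*' and '?' never match '/' (FNM_PATHNAME-like)."""
    pieces = []
    for ch in pattern:
        if ch == "*":
            pieces.append("[^/]*")   # any run of characters except '/'
        elif ch == "?":
            pieces.append("[^/]")    # exactly one character except '/'
        else:
            pieces.append(re.escape(ch))
    return re.compile("".join(pieces))


def path_match(pattern: str, path: str) -> bool:
    """True if the whole path matches the pattern."""
    return glob_to_regex(pattern).fullmatch(path) is not None


print(path_match("Documentation/*.html", "Documentation/git.html"))              # True
print(path_match("Documentation/*.html", "Documentation/ppc/ppc.html"))          # False
print(path_match("Documentation/*.html", "tools/perf/Documentation/perf.html"))  # False
```

The second path fails because * would have to match "ppc/ppc", which contains a slash; the third fails because the pattern is compared against the whole path, not a suffix of it.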


    Solution

    Going by the 3rd point, the pattern !images/*.* should make Git consider all the files in the images directory. It does not, because the documentation makes one more important point:

    Git doesn’t list excluded directories

    Because of the first line, *, the "images" directory itself is ignored, so Git never looks inside it. We must first tell Git to re-include the images directory, and then add lines that re-include the files we want from it.

    *
    !*.html
    # re-include the images folder itself
    !images/
    !images/*.*
    

    Note: the last line re-includes files of any type directly inside the images directory only, not files in any of its sub-directories (see the 3rd point in section 2).
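The behaviour described above can be approximated with a small Python model. This is a simplified sketch, not Git's implementation: matches and is_ignored are hypothetical names, ** and escapes are not modelled, and it assumes a single .gitignore at the repository root. It applies two rules: the last matching pattern wins, and an excluded parent directory can never be re-entered.

```python
from fnmatch import fnmatchcase


def matches(pattern: str, path: str, is_dir: bool) -> bool:
    """Simplified gitignore match for one pattern (no '**', no escapes)."""
    if pattern.endswith("/"):
        if not is_dir:               # 'foo/' only matches directories
            return False
        pattern = pattern.rstrip("/")
    if "/" not in pattern:
        # No slash: compare against the last path component, at any depth.
        return fnmatchcase(path.split("/")[-1], pattern)
    # Slash present: anchored to the root, and '*' must not cross a '/'.
    pat_parts = pattern.lstrip("/").split("/")
    path_parts = path.split("/")
    return len(pat_parts) == len(path_parts) and all(
        fnmatchcase(part, pat) for part, pat in zip(path_parts, pat_parts)
    )


def is_ignored(rules: list[str], file_path: str) -> bool:
    """Check each parent directory from the top down, then the file itself."""
    parts = file_path.split("/")
    for depth in range(1, len(parts) + 1):
        prefix = "/".join(parts[:depth])
        is_dir = depth < len(parts)
        excluded = False
        for rule in rules:           # the last matching rule wins
            negated = rule.startswith("!")
            pattern = rule[1:] if negated else rule
            if matches(pattern, prefix, is_dir):
                excluded = not negated
        if excluded:
            return True              # an excluded parent is never searched
    return False


first = ["*", "!*.html", "!images/*.*"]
fixed = ["*", "!*.html", "!images/", "!images/*.*"]
print(is_ignored(first, "images/river.jpg"))     # True: 'images' itself is excluded
print(is_ignored(fixed, "images/river.jpg"))     # False
print(is_ignored(fixed, "images/sub/deep.jpg"))  # True: 'images/sub' is excluded
```

In a real repository, git check-ignore -v <path> prints the .gitignore line that matched a given path, which is a more reliable check than any model like this one.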