Tags: javascript, node.js, glob

Node.js & glob: how to select any number of filenames and filename extensions, and exclude specific files, with a single string?


The following glob expression selects all .pug files except index.pug (fiddle):

/src/!(index){.pug,.haml,index.haml}

Now assume that we have decided to drop haml support, so everything related to haml must be removed from the expression above. The question is how to reduce the expression without radically changing its structure. This condition matters for building a simple yet flexible algorithm that generates glob expressions.

Wrong solution (no matches among .pug files):

/src/!(index){.pug}

The solution below is also unsuitable, because we may want to add support for another filename extension in the future (e.g. .slim). To do that, we would have to radically change the algorithm that generates the expression.

/src/!(index).pug

The solution below is also unsuitable, because it cannot exclude arbitrary files. What if we want to exclude index.pug and about.slim, but not index.slim and about.pug?

/src/!(index).+(pug|slim)

In other words, none of the solutions above scales to an arbitrary number of:

  • filenames that must be selected
  • filename extensions whose files must be selected
  • filenames with a specific extension that must be excluded.

Important: this question is not about retrieving files with glob, globby, gulp.src() etc. It is only about generating a single string.

Also: if this problem cannot be solved with a single string, please say so (with an explanation).


Solution

  • Why is it not working?

    You've run into a peculiarity of the syntax that minimatch supports. (For readers who don't know: the site that the OP uses for illustration evaluates globs with minimatch.) You look at a pattern like this:

    {a,b,c}
    

    and learn that it matches the names a, b and c. A brace pattern contains a series of subpatterns separated by commas, and the brace pattern matches if any of the subpatterns matches. So the pattern means "if the text matches a OR b OR c, then it is a match". So you figure that

    {a,b}
    

    matches a and b. Or in prose "if the text matches a OR b, then it is a match." Going further, you'd think you can do this:

    {a}
    

    to mean "if it matches a, then it is a match." It would be equivalent to just having the pattern a. Without knowing any better, it is reasonable to think that {a} would mean this. In programming languages that allow or-ing lists of conditions, you can usually perform the or-ing operation on a list of just one element.

    However, minimatch does not work this way. In order for braces to be recognized as expressing alternatives, they must contain at least one comma. So the expression {a} will match the literal input {a}, braces and all.

    Workarounds

    You could modify the code that generates the brace pattern so that if the list of subpatterns has only one element, then you repeat it. To reuse the example you gave, you'd have .pug twice in the braces:

    /src/!(index){.pug,.pug}
    

    This makes it so that the braces are going to be interpreted as expressing alternatives instead of being interpreted literally. By repeating the same subpattern, the set of matched files does not change.

    Another solution would be to generate a brace pattern that always contains an element that cannot match anything. For instance:

    /src/!(index){.pug,!(*)}
    

    The last element cannot match anything, so it does not add matches, but its presence is enough to make minimatch interpret the braces as you want.
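
Either workaround can be folded into the code that generates the pattern. The following is a minimal sketch, assuming hypothetical helpers named braceGroup and buildPattern (neither is part of minimatch). It only builds the string and does not perform any matching.

```javascript
// Hypothetical helper: builds a brace group that always contains a comma,
// so that a single alternative is not left as a literal "{.pug}".
function braceGroup(alternatives, padding = "repeat") {
  if (alternatives.length === 0) {
    throw new Error("At least one alternative is required");
  }
  const items = [...alternatives];
  if (items.length === 1) {
    // "repeat" duplicates the only entry; any other value appends the
    // never-matching subpattern "!(*)" described above.
    items.push(padding === "repeat" ? items[0] : "!(*)");
  }
  return `{${items.join(",")}}`;
}

// Hypothetical generator for the pattern shape used in the question.
function buildPattern(directory, excludedName, alternatives, padding) {
  return `${directory}/!(${excludedName})${braceGroup(alternatives, padding)}`;
}

console.log(buildPattern("/src", "index", [".pug"]));
// /src/!(index){.pug,.pug}
console.log(buildPattern("/src", "index", [".pug"], "never"));
// /src/!(index){.pug,!(*)}
console.log(buildPattern("/src", "index", [".pug", ".haml", "index.haml"]));
// /src/!(index){.pug,.haml,index.haml}
```

With a helper like this, the generator keeps the same structure whether the list has one entry or many, which is the property the question asks for.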


    For the sake of providing some background: what minimatch does with the braces is called "brace expansion". When you give minimatch a pattern a{b,c}d, then before it does anything else, it converts it to two patterns, abd and acd, and then it considers that there is a match if either pattern matches. Brace expansion is something that minimatch adopted from Unix shells along with the rest of the globbing syntax. (Its documentation tells you to see man sh, man bash, etc.)
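
To make the expansion step concrete, here is a simplified illustration. It is not minimatch's actual implementation: the function name expandBraces is an assumption, and the sketch handles only a single, non-nested brace group. It does reproduce the comma rule described above.

```javascript
// Simplified illustration of brace expansion (not minimatch's real code):
// handles one non-nested brace group per pattern.
function expandBraces(pattern) {
  const match = /^(.*?)\{([^{}]*)\}(.*)$/.exec(pattern);
  // No braces, or braces without a comma: the pattern is left untouched,
  // so the braces end up being matched literally.
  if (!match || !match[2].includes(",")) {
    return [pattern];
  }
  const [, prefix, body, suffix] = match;
  // One output pattern per comma-separated alternative.
  return body.split(",").map((alternative) => prefix + alternative + suffix);
}

console.log(expandBraces("a{b,c}d"));
// [ 'abd', 'acd' ]
console.log(expandBraces("/src/!(index){.pug}"));
// [ '/src/!(index){.pug}' ]
console.log(expandBraces("/src/!(index){.pug,.pug}"));
// [ '/src/!(index).pug', '/src/!(index).pug' ]
```

The second call shows why {.pug} finds no .pug files: without a comma nothing is expanded, and the braces stay in the pattern as literal characters.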