There is a file named foo.js in the current folder. I use find to search for it:
tigerlei::~/work $ ll foo.js
-rw-rw-r-- 1 tigerlei tigerlei 187 Mar 29 2017 foo.js
tigerlei::~/work $ find . -regex '.*/foo.*.j[R-T]+' -regextype egrep
./foo.js
tigerlei::~/work $ find . -regex '.*/foo.*.j[RST]+' -regextype egrep
tigerlei::~/work $ find . -iregex '.*/foo.*.j[RST]+' -regextype egrep
./foo.js
My system is Ubuntu 14.04, and the findutils version is 4.4.2.
When I use -regex, find matches case-sensitively. But [R-T] matches the lowercase letter 's', while [RST] does not match 's'.
Question
Why do my searches produce these results?
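You need to set LC_ALL=C to ensure that the characters forming the range in the bracket expression are ordered as in the ASCII table. In many other locales the collation order interleaves upper and lower case (typically r, R, s, S, t, T), so the range [R-T] can include 's', whereas [RST] lists its characters explicitly and is not affected by sort order.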
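As a minimal sketch of that advice, assuming GNU findutils and a throwaway directory created with mktemp (the variable names are only illustrative), the range can be checked under the C locale. Note that -regextype only affects the -regex and -iregex tests that come after it on the command line, so it is placed first here:

```shell
# Scratch directory (hypothetical setup) so nothing in the real tree is touched.
tmpdir=$(mktemp -d)
touch "$tmpdir/foo.js"

# In the C locale, [R-T] covers only the ASCII letters R, S and T,
# so the lowercase "s" in foo.js is not matched.
range_c=$(cd "$tmpdir" && LC_ALL=C find . -regextype egrep -regex '.*/foo.*.j[R-T]+')

# Case-insensitive matching has to be requested explicitly with -iregex.
iregex_c=$(cd "$tmpdir" && LC_ALL=C find . -regextype egrep -iregex '.*/foo.*.j[R-T]+')

printf 'regex:  [%s]\n' "$range_c"
printf 'iregex: [%s]\n' "$iregex_c"

rm -rf "$tmpdir"
```

With LC_ALL=C the -regex search should print nothing between the brackets, while the -iregex search should still report ./foo.js.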
See this thread:
If you mean to match a letter in the user's language, use grep '[[:alpha:]]' and don't modify LC_ALL. But if you want to match the a-zA-Z ASCII characters, you need either LC_ALL=C grep '[[:alpha:]]' or LC_ALL=C grep '[a-zA-Z]'.
[a-z] matches the characters that sort after a and before z (though with many APIs it's more complicated than that). In other locales, you generally don't know what those are. For instance, some locales ignore case for sorting, so [a-z] in some APIs, like bash patterns, could include [B-Z] or [A-Y]. In many UTF-8 locales (including en_US.UTF-8 on most systems), [a-z] will include the Latin letters from a to y with diacritics but not those of z (since z sorts before them)...
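The C-locale half of the quoted advice can be sketched with grep. This is only an illustration, assuming a POSIX printf and a two-line sample input (the ASCII line "abc" and the UTF-8 bytes for "é"); the variable names are made up for the example:

```shell
# Count the lines that contain at least one matching character.
# The second input line is "é" encoded as the UTF-8 bytes 0xC3 0xA9.
ascii_range=$(printf 'abc\n\303\251\n' | LC_ALL=C grep -c '[a-z]')
ascii_alpha=$(printf 'abc\n\303\251\n' | LC_ALL=C grep -c '[[:alpha:]]')

echo "lines matching [a-z]:       $ascii_range"
echo "lines matching [[:alpha:]]: $ascii_alpha"
```

Under LC_ALL=C both counts should be 1, since only the ASCII line matches. Without LC_ALL=C the results depend on the user's locale, which is the point the quoted thread makes.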