I was trying to find a regex that matches any string, and after some searching I found that almost all the answers say that [\s\S] will match any string (as said here), or .* (as said here).
But while playing a bit with PHP's preg_match, I found that an empty regex matches any string!
if (preg_match("//u", "")) echo "empty string matches\n";
else echo "empty string does not match\n";
if (preg_match("//u", "abc")) echo "abc matches\n";
else echo "abc does not match\n";
if (preg_match("//u", "\n")) echo "new line matches\n";
else echo "new line does not match\n";
if (preg_match("//u", "/")) echo "/ matches\n";
else echo "/ does not match\n";
exit;
This will output:
empty string matches
abc matches
new line matches
/ matches
live demo (https://eval.in/845001)
Can I safely use this empty regex to match anything? And what does an empty regex mean?
If you are asking why I would need a regex that matches anything: I'm using a function that requires a regex parameter as part of its string validation functionality, and I want it to accept anything.
An empty regex pattern // matches at the start, at the end, and at every position between characters in a string. See this demo at eval.in: preg_match_all('//', "foo", $out); returns 4 empty matches:
Array
(
    [0] =>
    [1] =>
    [2] =>
    [3] =>
)
Since preg_match only checks for the first match, it should be fine to use the empty pattern. However, I'd generally prefer /^/, which matches the start of the string, something every string has.
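The difference between counting all empty matches and stopping at the first one can be sketched as follows. This is only an illustration; the helper name count_matches is hypothetical and not part of the original demo.

```php
<?php
// Hypothetical helper for illustration: counts how many times a pattern
// matches within a subject string.
function count_matches(string $pattern, string $subject): int
{
    // preg_match_all() returns the number of full-pattern matches.
    return (int) preg_match_all($pattern, $subject, $out);
}

// The empty pattern matches before each character and once at the end.
var_dump(count_matches('//', 'foo'));  // int(4)

// The start anchor matches exactly once, whatever the subject is.
var_dump(count_matches('/^/', 'foo')); // int(1)

// preg_match() stops after the first match, so both patterns succeed
// even on an empty subject.
var_dump(preg_match('//', ''));        // int(1)
var_dump(preg_match('/^/', ''));       // int(1)
```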
[\s\S] (whitespace together with non-whitespace in one character class) simply means any character. It is usually used to also match newlines where no flag is available to make the dot match line breaks. It is often seen in JS regex, which did not support the s flag before ES2018. Similar are [\D\d] (digits and non-digits) and [\w\W] (word characters and non-word characters). Also possible in JS regex is [^], a negated empty character class meaning "not nothing".
Using /[\s\S]/ or one of the others without a quantifier will require at least one character.
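A minimal sketch of how the quantifier and the s flag change what is accepted, using an empty string and a lone newline as subjects. The helper name accepts is an assumption made for this example.

```php
<?php
// Hypothetical helper for illustration: true if the pattern matches
// somewhere in the subject.
function accepts(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

// Without a quantifier, the class needs one character to consume.
var_dump(accepts('/[\s\S]/', ''));    // bool(false)
var_dump(accepts('/[\s\S]/', "\n"));  // bool(true)

// With *, zero characters are enough, so the empty string is accepted.
var_dump(accepts('/[\s\S]*/', ''));   // bool(true)

// The plain dot skips newlines unless the s flag is set.
var_dump(accepts('/./', "\n"));       // bool(false)
var_dump(accepts('/./s', "\n"));      // bool(true)
```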
Also note that your patterns use the u flag for Unicode regex. There is probably no reason to use this flag together with an empty pattern or when just checking for the start of the string. It also makes a difference: with u, preg_match returns false for a subject that is not valid UTF-8, so //u does not accept arbitrary bytes. With PCRE Unicode regex, the following escape sequences might be interesting.
\X matches a Unicode grapheme. It is similar to the dot in u mode, although the dot does not match newlines.
\C matches one data unit (similar to using the dot without the u flag on Unicode input).
Well, I don't really see why one would need a pattern to match any string, but I wrote this out of interest :)
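As a footnote on the u flag, here is a small sketch of two effects it has: subjects must be valid UTF-8, and the unit consumed by the dot changes from a byte to a code point. The sample strings and variable names are chosen only for this example.

```php
<?php
// A lone 0xFF byte can never appear in valid UTF-8.
$invalid = "\xFF";

$withoutU = preg_match('//', $invalid);   // 1: the subject is treated as raw bytes
$withU    = preg_match('//u', $invalid);  // false: the subject fails UTF-8 validation
$badUtf8  = preg_last_error() === PREG_BAD_UTF8_ERROR;

var_dump($withoutU, $withU, $badUtf8);    // int(1), bool(false), bool(true)

// "e" followed by U+0301 (combining acute accent): one visible character,
// two code points, three bytes in UTF-8.
$s = "e\u{0301}";

$graphemes  = preg_match_all('/\X/u', $s); // 1
$codePoints = preg_match_all('/./u', $s);  // 2
$bytes      = preg_match_all('/./', $s);   // 3

var_dump($graphemes, $codePoints, $bytes); // int(1), int(2), int(3)
```

So if the validation function really must accept any byte sequence, dropping the u flag from the empty pattern looks like the safer choice.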