Background
I have recently had a problem with a regular expression not working as expected in IE9. I tracked the issue down to a specific block inside the expression, namely [^].
var reg = /((?:abc.[^]*?)?test\s*(?:xyz)?\s*)[^]*?/;
The problem
var str = 'abc 123\nabc 123\nabc 123\ntest xyz';
var reg = /((?:abc.[^]*?)?test\s*(?:xyz)?\s*)[^]*?/;
alert(reg.exec(str));
In other words:
Input:
abc 123
abc 123
abc 123
test xyz
Output
Expected: ["abc 123\nabc 123\nabc 123\ntest xyz","abc 123\nabc 123\nabc 123\ntest xyz"]
Chrome: ["abc 123\nabc 123\nabc 123\ntest xyz","abc 123\nabc 123\nabc 123\ntest xyz"]
IE9: ["test xyz", "test xyz"] // Wrong!!!
Attempted solution
I found that the [^] block is causing the error. By simply switching [^] to [\S\s] I was able to attain the expected output in IE9.
var str = 'abc 123\nabc 123\nabc 123\ntest xyz';
var reg = /((?:abc.[\S\s]*?)?test\s*(?:xyz)?\s*)[\S\s]*?/;
alert(reg.exec(str));
Output
Expected: ["abc 123\nabc 123\nabc 123\ntest xyz","abc 123\nabc 123\nabc 123\ntest xyz"]
Chrome: ["abc 123\nabc 123\nabc 123\ntest xyz","abc 123\nabc 123\nabc 123\ntest xyz"]
IE9: ["abc 123\nabc 123\nabc 123\ntest xyz","abc 123\nabc 123\nabc 123\ntest xyz"]
Question
So what is the essential difference between [^] and [\S\s]? What is the problem here? Am I just dealing with an edge case in IE's JavaScript engine?
There is no difference between [^] and [\s\S]. [^] is part of the JavaScript specification, but IE9 doesn't handle it correctly, as is the case with many other JavaScript features.
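To make the equivalence concrete, here is a small check that could be run in any console. It is only a sketch: the sample characters and the variable names are my own choices, and on a spec-compliant engine I would expect both flags to come out true.

```javascript
// Characters to probe, including the line terminators that `.` skips.
var samples = ['a', '0', ' ', '\t', '\n', '\r', '\u2028', '\u2029'];

var emptyNegated = /^[^]$/;   // negated class with nothing excluded
var spaceOrNot = /^[\s\S]$/;  // whitespace or non-whitespace

// True when both patterns give the same answer for every sample.
var sameResult = samples.every(function (ch) {
  return emptyNegated.test(ch) === spaceOrNot.test(ch);
});

// True when [^] matches every sample, newline included.
var matchesAll = samples.every(function (ch) {
  return emptyNegated.test(ch);
});

console.log(sameResult, matchesAll);
```

If either flag is false in a given browser, that engine is not treating [^] the way the specification describes.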
As far as I know, [^] is particular to JavaScript; I have never seen it in another regex flavour. In other flavours, [^] is treated either as a syntax error or as an unclosed character class: the closing bracket does not end the class because it immediately follows the ^, so the class is only closed at the next closing bracket, if there is one.
Note that [^] and [] have been allowed ever since regex support was first added to the language (ECMA-262, 3rd edition, December 1999).
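The two forms are opposites, which a short sketch can illustrate. The variable names are mine and the behaviour described is what a spec-compliant engine should give: [] is a class with no members, so it can never match, while [^] excludes nothing, so it matches any single character.

```javascript
var never = /[]/;    // empty class: no character qualifies
var always = /[^]/;  // negated empty class: every character qualifies

// Nothing in the input can satisfy an empty class.
var emptyMatchesSomething = never.test('abc\n123');

// A lone newline is still "a character that is not in the empty set".
var negatedMatchesNewline = always.test('\n');

// [^] still consumes exactly one character, so an empty string fails.
var negatedMatchesEmpty = always.test('');

console.log(emptyMatchesSomething, negatedMatchesNewline, negatedMatchesEmpty);
```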
In the ECMA-262 3rd edition specification (section 15.10.2.13), a negated character class is defined like this:
CharacterClass :: [^ ClassRanges ]
where ClassRanges may be empty.
This definition is unchanged in the 6th edition (June 2015).
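Since the grammar allows [^] but the question shows IE9 returning a different match, one defensive option is to test for the behaviour at runtime. The helper below is hypothetical (the name handlesEmptyNegatedClass is mine) and it relies only on the input and expected output given in the question; I have not verified it against IE9 myself.

```javascript
// Hypothetical helper: returns true when the engine produces the
// expected full-string match for the [^] pattern from the question.
function handlesEmptyNegatedClass() {
  var str = 'abc 123\nabc 123\nabc 123\ntest xyz';
  var reg = /((?:abc.[^]*?)?test\s*(?:xyz)?\s*)[^]*?/;
  var match = reg.exec(str);
  return match !== null && match[0] === str;
}

// Choose the "any character" class this engine handles as expected.
var anyChar = handlesEmptyNegatedClass() ? '[^]' : '[\\s\\S]';

console.log(anyChar);
```

In practice it is simpler to write [\s\S] everywhere, since it behaves the same in both browsers tested in the question; the check is mainly useful for confirming which engines are affected.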