Is there any application for a regex split() operation that could not be performed by a single match() (or search(), findall() etc.) operation?
For example, instead of doing
subject.split('[|]')
you could get the same result with a call to
subject.findall('[^|]*')
And in nearly all regex engines (except .NET and JGSoft), split() can't do some things like "split on | unless it is escaped as \|", because you'd need unlimited repetition inside a lookbehind.
So instead of having to do something quite unreadable like this (nested lookbehinds!)
splitArray = Regex.Split(subjectString, @"(?<=(?<!\\)(?:\\\\)*)\|");
you can simply do (even in JavaScript which doesn't support any kind of lookbehind)
result = subject.match(/(?:\\.|[^|])*/g);
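To make the match-based approach concrete, here is a small sketch in JavaScript. The sample string is made up for illustration. One caveat worth hedging: with the `*` quantifier, JavaScript's global match also reports an empty match at each delimiter and at the end of the string, so the raw result is not identical to what a split would return. The `+` variant below is an assumption added here to avoid that; it in turn drops fields that are genuinely empty.

```javascript
// Hypothetical sample input: the second pipe is escaped with a backslash.
const subject = 'a|b\\|c|d';

// Pattern from the question, unchanged. In JavaScript this also yields
// an empty string at every unescaped pipe and at the end of the input.
const withEmpties = subject.match(/(?:\\.|[^|])*/g);
// withEmpties is ['a', '', 'b\\|c', '', 'd', '']

// Variant with + instead of * (an assumption, not from the question):
// no empty matches, but empty fields such as in 'a||b' are lost.
const fields = subject.match(/(?:\\.|[^|])+/g);
// fields is ['a', 'b\\|c', 'd']

console.log(withEmpties, fields);
```

Other engines may treat empty matches differently, so the exact output of the `*` version should be checked per engine.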
This has led me to wonder: is there anything at all that I can do in a split() that's impossible to achieve with a single match()/findall() instead? I'm willing to bet there isn't, but I'm probably overlooking something.
(I'm defining "regex" in the modern, non-regular sense, i.e., using everything that modern regexes have at their disposal, like backreferences and lookaround.)
The purpose of regular expressions is to describe the syntax of a language. These regular expressions can then be used to find strings that match the syntax of these languages. That’s it.
What you actually do with the matches depends on your needs. If you're looking for all matches, repeat the find process and collect the matches. If you want to split the string, repeat the find process and split the input string at the positions where the matches were found.
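As a rough sketch of that idea, both "collect all matches" and "split" can be written as loops around a single match search. The helper names `findAllWithExec` and `splitWithExec` are hypothetical, and the handling of empty matches (skipping them when splitting) is a simplification chosen here, not how any particular library does it.

```javascript
// Return a copy of the pattern that is guaranteed to have the global flag,
// so that exec() continues from lastIndex on each call.
function toGlobal(pattern) {
  const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
  return new RegExp(pattern.source, flags);
}

// Hypothetical helper: repeat the search and collect every match.
function findAllWithExec(str, pattern) {
  const re = toGlobal(pattern);
  const matches = [];
  let m;
  while ((m = re.exec(str)) !== null) {
    matches.push(m[0]);
    if (m[0] === '') re.lastIndex++; // step past empty matches to avoid looping forever
  }
  return matches;
}

// Hypothetical helper: repeat the search and cut the input at each match.
function splitWithExec(str, pattern) {
  const re = toGlobal(pattern);
  const parts = [];
  let start = 0;
  let m;
  while ((m = re.exec(str)) !== null) {
    if (m[0] === '') { re.lastIndex++; continue; } // simplification: ignore empty matches
    parts.push(str.slice(start, m.index));
    start = m.index + m[0].length;
  }
  parts.push(str.slice(start)); // remainder after the last delimiter
  return parts;
}

console.log(findAllWithExec('a|b|c', /[^|]+/)); // ['a', 'b', 'c']
console.log(splitWithExec('a|b||c', /\|/));     // ['a', 'b', '', 'c']
```

Both helpers differ only in what they do with each match: one keeps the matched text, the other keeps the text between matches.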
So basically, regular expression libraries can only do one thing: perform a search for a match. Everything else is just an extension.
A good example of this is JavaScript, where RegExp.prototype.exec is what actually performs the match search. Any other method that accepts a regular expression (e.g. RegExp.prototype.test, String.prototype.match, String.prototype.search) just uses the basic functionality of RegExp.prototype.exec somehow:
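// pseudo-implementations (simplified; the global flag is not handled)
RegExp.prototype.test = function(str) {
  // true if exec finds a match, false otherwise
  return RegExp(this).exec(str) !== null;
};
String.prototype.match = function(pattern) {
  // the match array, or null if there is no match
  return RegExp(pattern).exec(this);
};
String.prototype.search = function(pattern) {
  // index of the first match, or -1 if there is no match
  var m = RegExp(pattern).exec(this);
  return m === null ? -1 : m.index;
};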