I was wondering if there might be some reason someone would want to use a regular expression for a problem that could also be written easily without using regular expressions.
I came to this thought because of this question.
The question is fairly simple, and the answers fall into two categories: those that solve it with regular expressions and those that just use some other simple operation.
Summary of the question:
Remove the first part of a URL path (example: String path = "/folder1/folder2/folder3/").
2 Solutions:
//With regex
String newPathRegex = path.replaceAll("^/[^/]*", "");
//Without regex
String newPathNoRegex = path.substring(path.indexOf('/', 1));
Personally I think the no RegEx solution is a lot easier to read, but I'm not an expert on regular expressions.
So the question comes down to: Should you avoid using regular expressions in cases as simple as this one? Is there better performance in the RegEx solution?
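To make the comparison concrete, here is a minimal sketch that wraps both snippets in methods. The class and method names are hypothetical, and the handling of a path without a second slash is an assumption: the original substring call would throw in that case, so this version returns an empty string to match what the regex does. The pattern is compiled once because String.replaceAll compiles its pattern on every call.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class FirstSegment {
    // Compiled once and reused across calls.
    private static final Pattern FIRST_SEGMENT = Pattern.compile("^/[^/]*");

    // Regex version: same pattern as in the question.
    static String withRegex(String path) {
        return FIRST_SEGMENT.matcher(path).replaceAll("");
    }

    // Non-regex version: same idea as in the question, plus a guard.
    static String withoutRegex(String path) {
        int next = path.indexOf('/', 1);
        // Assumption: with no second slash, return "" like the regex version
        // instead of letting substring(-1) throw.
        return next < 0 ? "" : path.substring(next);
    }

    public static void main(String[] args) {
        String path = "/folder1/folder2/folder3/";
        System.out.println(withRegex(path));
        System.out.println(withoutRegex(path));
    }
}
```

Both methods agree on paths that start with a slash. For a path without a leading slash, such as "folder1/x", they differ: the regex leaves it unchanged, while the substring version still cuts at the first slash it finds.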
A few reasons why it is useful to use regular expressions:
Regular expressions in the formal sense can be matched in O(n log n) in the size of the expression and O(n) in the length of the string. So the time complexity is guaranteed to be very reasonable, whereas custom programs can sometimes be badly implemented. Note that this guarantee belongs to automaton-based matchers: backtracking engines such as java.util.regex can take exponential time on some patterns, although a simple pattern like the one in the question is not affected. Most programs running in (pseudo-)linear time are considered to be very fast. Although it is possible to construct a tailor-made algorithm that outperforms a regular expression for each task a regex can carry out, it is in general not easy for humans to do so. Regular expressions thus usually give you a fast enough algorithm.
Most properties of regular expressions are decidable: for example, it is decidable whether two regular expressions determine the same set of strings. So there is an entire algebra defined over them. By contrast, all non-trivial semantic properties of programs are undecidable: that is a consequence of Rice's theorem, so you cannot prove in general that two programs do the same thing (are equivalent), whereas for regular expressions this can always be decided.
Modifiable. Perhaps you want to remove the first part of the path, but only if it is not "..". In general, modifications to a regular expression tend to be easy, whereas modifying a program can blow up the size of the code.
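As a rough illustration of that point, here is one way the ".." variant might look in both styles. This is a sketch, not the only way to write either version: the class and method names are hypothetical, and the choice to leave paths without a leading slash untouched is an assumption. The regex change is a single negative lookahead, while the hand-written version needs an explicit extraction and comparison of the first segment.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class SkipParentSegment {
    // Same pattern as before, plus a negative lookahead that refuses to
    // match when the first segment is exactly "..".
    private static final Pattern FIRST_NOT_PARENT =
            Pattern.compile("^/(?!\\.\\.(?:/|$))[^/]*");

    static String withRegex(String path) {
        return FIRST_NOT_PARENT.matcher(path).replaceAll("");
    }

    static String withoutRegex(String path) {
        // Assumption: a path without a leading slash is returned unchanged.
        if (!path.startsWith("/")) {
            return path;
        }
        int next = path.indexOf('/', 1);
        // First segment is everything between the leading slash and the next one.
        String first = next < 0 ? path.substring(1) : path.substring(1, next);
        if (first.equals("..")) {
            return path; // keep the path as-is when it starts with "/.."
        }
        return next < 0 ? "" : path.substring(next);
    }

    public static void main(String[] args) {
        System.out.println(withRegex("/folder1/folder2/"));
        System.out.println(withRegex("/../folder2/"));
        System.out.println(withoutRegex("/folder1/folder2/"));
        System.out.println(withoutRegex("/../folder2/"));
    }
}
```

Whether the lookahead is easier to read than the extra branches is a matter of taste, which ties in with the readability concern raised in the question.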
The most problematic part is that not all programmers are familiar with regular expressions, and that they are a bit cryptic: the semantics are sometimes hard to guess. Furthermore, not every problem can be expressed as a regular expression: the pumping lemma can be used to show that some languages are not regular.