So I have the following task, where I have to explain whether or not it's possible to construct a regular expression following these constraints:
My task is not to actually write an regular expression, but just to reasoning about the fact if it's possible or not.
My conclusion is that it's perfectly possible, but I would have to write down every possible string in my regex, which would be extremely long.
So far I have the following regular expression:
(([3-9]*[1]{n}[3-9]*[2]{n}[3-9]*){0,1000})
Where n is iterating towards it's maximum length of 1.000.000
I know that iterators is not a "thing" in the regular language. But is there an easier way to "iterate" n, instead of having to write it all by hand?
My conclusion is that it's perfectly possible, but I would have to write down every possible string in my regex, which would be extremely long.
For questions about theory, the fact that it would be extremely long is irrelevant. You're correct. Any finite language is regular.
I know that iterators is not a "thing" in the regular language. But is there an easier way to "iterate" n, instead of having to write it all by hand?
No, there isn't. Perhaps it would help to think about how to construct a state machine for this.
One state, the initial state, is the one where the number of 1
s seen and the number of 2
s seen are equal.
A second state is one where one more 1
has been seen than 2
s.
A third state is one where one more 2
has been seen than 1
s.
A fourth state is one where two more 1
s have been seen than 2
s. Etc...
It should be fairly straightforward to convince yourself that you cannot combine any of those states, as there is no overlap in the valid remaining substrings for each state. Therefore, you would need >1000000 states, and as the translation of a regular expression to a state machine is fairly direct, the complexity of a regular expression cannot be less than the complexity of the state machine.
Of course, when programming, you can implement this far more efficiently. You can for instance store the count of 1
s and 2
s in additional variables. But variables are a feature state machines do not have.