Tags: regex, lazy-evaluation, greedy, regex-greedy, non-greedy

Logic behind lazy regex using '?'?


Here is my question.

For example, if your pattern is:

abc?

Then this will match ab and abc in full, but not abd, because c? means: if there is a c, match it; if not, no worries.
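To check that reading of the optional c, here is a minimal sketch in Python (my choice of language; the question does not name a regex engine). It compares a whole-string match with a substring search, since the two behave differently on abd:

```python
import re

pattern = re.compile(r"abc?")

# fullmatch requires the entire string to fit the pattern.
full = {s: bool(pattern.fullmatch(s)) for s in ["ab", "abc", "abd"]}
print(full)

# search only needs some substring to fit, so it finds "ab" inside "abd".
found = pattern.search("abd").group()
print(found)
```

So abd is rejected only when the whole string has to match; an unanchored search still finds the ab at the start of it.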

So say you have something like this:

->sometext<-->somemoretext<-

If you have a greedy pattern like ->.*<- then it will produce a single match covering the whole string:

->sometext<-->somemoretext<-

However, if your pattern is lazy, ->.*?<- then it will produce two matches: ->sometext<- and ->somemoretext<-
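The two behaviours can be reproduced with a short Python sketch (Python's re module is assumed here only for illustration; the variable names are mine):

```python
import re

text = "->sometext<-->somemoretext<-"

# Greedy: .* grabs as much as it can, so the match runs to the last "<-".
greedy = re.findall(r"->.*<-", text)

# Lazy: .*? grabs as little as it can, so each match ends at the nearest "<-".
lazy = re.findall(r"->.*?<-", text)

print(greedy)  # ['->sometext<-->somemoretext<-']
print(lazy)    # ['->sometext<-', '->somemoretext<-']
```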

If ? means something like "whether or not" (as in the first example), then what is the logic behind the second example? Can someone explain why the match stops at ->sometext<- when the pattern is .*?


Solution

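  • ? has a second meaning when it is placed directly after a quantifier (*, +, ? or {n,m}): it makes that quantifier lazy. A lazy quantifier first tries to match the minimum it is allowed (0 characters for *), then 1 more character if the rest of the pattern failed, then 1 more again if that failed, and so on. The default behaviour is the opposite, 'greedy': match MAX characters, then MAX-1 if that failed, then MAX-2 if that failed, wanting to match as much as possible. That is why ->.*?<- stops at ->sometext<-: the .*? grows one character at a time and the match succeeds as soon as the first <- follows it.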
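The order of attempts described above can be sketched as a toy matcher in Python. This is not how any real regex engine is implemented; match_between is a hypothetical helper that handles only the fixed shape opener + .* (or .*?) + closer, anchored at the start of the text, and it ignores the rule that . does not match a newline:

```python
def match_between(text, opener, closer, lazy):
    """Toy model of opener + '.*' (greedy) or '.*?' (lazy) + closer at index 0."""
    if not text.startswith(opener):
        return None
    start = len(opener)
    max_len = len(text) - start
    # Lazy tries 0, 1, 2, ... characters; greedy tries MAX, MAX-1, MAX-2, ...
    lengths = range(0, max_len + 1) if lazy else range(max_len, -1, -1)
    for n in lengths:
        # The attempt succeeds as soon as the closer follows the n characters.
        if text.startswith(closer, start + n):
            return text[: start + n + len(closer)]
    return None


text = "->sometext<-->somemoretext<-"
lazy_result = match_between(text, "->", "<-", lazy=True)
greedy_result = match_between(text, "->", "<-", lazy=False)
print(lazy_result)    # ->sometext<-
print(greedy_result)  # ->sometext<-->somemoretext<-
```

Both versions look for the same thing; they differ only in which length they try first, which is why the lazy one returns the shortest candidate and the greedy one the longest.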