Regex quantifiers and character classes


The Java Tutorial has examples and descriptions of regex quantifiers:

Greedy - consumes the entire input first, then backs off one character at a time and retries

Regex: .*foo  // greedy quantifier
String to search: xfooxxxxxxfoo
Found "xfooxxxxxxfoo"

Reluctant - starts at the beginning and consumes one character at a time, checking for a match after each

Regex: .*?foo  // reluctant quantifier
String to search: xfooxxxxxxfoo
Found "xfoo", "xxxxxxfoo"

Possessive - consumes the entire input and tries for a match only once, never backing off

Regex: .*+foo // possessive quantifier
String to search: xfooxxxxxxfoo
No match found
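
For reference, here is a minimal sketch that should reproduce the three results above with java.util.regex. The class name QuantifierDemo and the helper findAll are illustrative names of my own, not something from the Java Tutorial.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuantifierDemo {

    // Illustrative helper: collects every match that Matcher.find() reports, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        System.out.println(findAll(".*foo", input));   // greedy: one match, the whole input
        System.out.println(findAll(".*?foo", input));  // reluctant: "xfoo", then "xxxxxxfoo"
        System.out.println(findAll(".*+foo", input));  // possessive: no match, empty list
    }
}
```

The possessive case fails because .*+ consumes the trailing foo as well and never gives it back, so the literal foo has nothing left to match.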

These make sense to me, but can someone explain what happens when the regex is changed to a character class? Do any other rules apply?

Regex: [fx]*
String to search: xfooxxxxxxfoo
Found "xf","","","xxxxxxf","","",""

Regex: [fx]*?
String to search: xfooxxxxxxfoo
Found 14 zero-length matches

Regex: [fx]*+
String to search: xfooxxxxxxfoo
Found "xf","","","xxxxxxf","","",""

Solution

  • The quantifier (greedy, reluctant/lazy, or possessive) applies to the entire character class, not to the individual characters inside it. Each repetition matches any one character from the class, so [fx]* matches a run of zero or more characters that are each f or x.

    Regex: [fx]*
    String to search: xfooxxxxxxfoo
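    Found "xf","","","xxxxxxf","","",""

    So it looks for zero or more of f or x. The engine first finds xf. At each o it cannot consume anything, but zero repetitions still count as a match, so it reports an empty string there. It then matches the run of consecutive x's plus the following f, reports an empty string at each of the last two o's, and reports one more at the end of the input. The reluctant [fx]*? has nothing after it that would force it to consume characters, so it is satisfied with zero characters at every position, which is why all of its matches are zero-length. The possessive [fx]*+ gives the same result as the greedy version because nothing after the quantifier ever requires backtracking.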

    I would check out regex101.com for more detail on regexes, especially the debugger in the left sidebar.
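
To check the character-class behaviour directly in Java, here is a minimal sketch. The class name CharClassDemo and the helper findAll are illustrative names I made up, and the code only relies on Pattern and Matcher from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CharClassDemo {

    // Illustrative helper: collects every match, including zero-length ones.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // Greedy, reluctant, and possessive forms of the same character class.
        for (String regex : new String[] {"[fx]*", "[fx]*?", "[fx]*+"}) {
            List<String> matches = findAll(regex, input);
            System.out.println(regex + " -> " + matches.size() + " matches: " + matches);
        }
    }
}
```

On this 13-character input the greedy and possessive forms should each report seven matches ("xf", two empty strings, "xxxxxxf", three empty strings), and the reluctant form should report fourteen zero-length matches, one for each position from 0 through 13.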