java regex capturing-group

Regex to match a number or nothing


I need a regex that can match lines like these:

1234 <CIRCLE> 12 12 12 </CIRCLE>

1234 <RECTANGLE> 12 12 12 12 </RECTANGLE>

I've come up with this regex:

(\\d+?) <([A-Z]+?)> (\\d+?) (\\d+?) (\\d+?) (\\d*)? (</[A-Z]+?>)

It works fine when I try to match the rectangle, but it doesn't work for the circle.

The problem is that my fifth group is not capturing, though it should be.


Solution

  • That is because only the (\\d*)? part is optional; the spaces before and after it are mandatory. When the last (\\d*) matches nothing, the regex still requires two consecutive spaces before the closing tag, and the circle line has only one. Try something like

    (\\d+?) <([A-Z]+?)> (?:(\\d+?) ){3,4}(</[A-Z]+?>)
    

    Oh, and if you want to make sure that the closing tag is the same as the opening one, you can use a backreference: \\1 refers to whatever the first group matched, \\2 to the second, and so on. The tag name is captured by group 2, so you could update your regex to something like

    (\\d+?) <([A-Z]+?)> (?:(\\d+?) ){3,4}(</\\2>)
    //        ^^^^^^^-----------------------^^^ 
    //        group 2                       here the value must match what group 2 captured
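
To see the difference in practice, here is a minimal Java sketch that runs the question's pattern and a fixed pattern against the sample lines. The class name `ShapeLineDemo` and the helper `numbersOf` are made up for this illustration. One deliberate variation from the regex above: a capturing group inside a repetition only keeps its last repetition, so this sketch wraps the whole repeated run in group 3 in order to get all the numbers back.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ShapeLineDemo {
    // Pattern from the question: the optional (\d*)? sits between two mandatory spaces.
    static final Pattern ORIGINAL = Pattern.compile(
        "(\\d+?) <([A-Z]+?)> (\\d+?) (\\d+?) (\\d+?) (\\d*)? (</[A-Z]+?>)");

    // Fixed pattern: every number carries its own trailing space, repeated 3 or 4 times.
    // The backreference to group 2 forces the closing tag to repeat the opening tag name.
    // Group 3 wraps the whole run of numbers (an assumption of this sketch, see above).
    static final Pattern FIXED = Pattern.compile(
        "(\\d+?) <([A-Z]+?)> ((?:\\d+? ){3,4})</\\2>");

    /** Returns the numbers inside the tag, space-separated, or null if the line does not match. */
    static String numbersOf(String line) {
        Matcher m = FIXED.matcher(line);
        return m.matches() ? m.group(3).trim() : null;
    }

    public static void main(String[] args) {
        String circle = "1234 <CIRCLE> 12 12 12 </CIRCLE>";
        String rectangle = "1234 <RECTANGLE> 12 12 12 12 </RECTANGLE>";
        String mismatched = "1234 <CIRCLE> 12 12 12 </RECTANGLE>";

        System.out.println(ORIGINAL.matcher(circle).matches()); // false
        System.out.println(numbersOf(circle));                  // 12 12 12
        System.out.println(numbersOf(rectangle));               // 12 12 12 12
        System.out.println(numbersOf(mismatched));              // null
    }
}
```

If you need the numbers individually, splitting the result of `numbersOf` on a space is probably simpler than trying to capture each one in its own group.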