Regex: How to get coordinates without using sub-groups?


I've been trying to extract sets of coordinates in the following format:

[-34.0, 23, 0.555] , [3, 4, 5], ....

For the first set, I wish to extract "-34.0", "23", and "0.555". For the second set, "3", "4", and "5".

I've found a way to do so on Stack Overflow and through my own experiments on https://regexr.com, but it also captures ".0" and ".555" as subgroups, which I do not want.

\[([-]?\d+(\.\d+)?),\s([-]?\d+(\.\d+)?),\s([-]?\d+(\.\d+)?)\]

However, my initial alternatives are not working. Why are they not valid, and how can I create a regex that meets my requirements?

a: Does not treat the left bracket of [\d] as a special character, and thus pairs its right bracket with the left bracket of the [\. component

\[([-]?\d+[\.[\d]+]?),\s([-]?\d+[\.[\d]+]?),\s([-]?\d+[\.[\d]+]?)\]

b: Does not register the + sign as a special character

\[([-]?\d+[\.\d+]?),\s([-]?\d+[\.\d+]?),\s([-]?\d+[\.\d+]?)\]

Thank you for your time!

Update:

I have now been made aware of the non-capturing group feature.

First of all - thank you! It did the job I needed.

Second of all - I'm still curious as to why the other options didn't work, so I'll leave this up for the next 24 hours or so, at least.

Update v2:

Questions fully answered. Thank you so much, everyone!


Solution

  • Your pattern does not behave as intended because \d+[\.[\d]+]? is read as one or more digits \d+, followed by a character class [\.[\d] (a dot, a literal [, or a digit) repeated one or more times by +, and then an optional literal ]. The inner [ does not open a nested class.

    You could write the pattern using 3 capture groups, with optional non-capturing groups (?:...)? for the fractional parts:

    \[(-?\d+(?:\.\d+)?),\s(-?\d+(?:\.\d+)?),\s(-?\d+(?:\.\d+)?)]
    

    See a regex demo.
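
As a quick check of the pattern above, here is a minimal Python sketch. The question does not name a language, so Python's `re` module and the variable names (`COORD`, `ORIGINAL`, `text`) are assumptions made for illustration. It compares the original pattern, whose inner groups capture the fractional parts, with the non-capturing version.

```python
import re

# Sample input in the format from the question.
text = "[-34.0, 23, 0.555] , [3, 4, 5]"

# Original pattern: the inner (\.\d+)? groups are capturing, so each
# coordinate contributes two groups (the number and its fraction).
ORIGINAL = re.compile(
    r"\[([-]?\d+(\.\d+)?),\s([-]?\d+(\.\d+)?),\s([-]?\d+(\.\d+)?)\]"
)

# Suggested pattern: (?:...) groups without capturing, so only the
# three coordinates are returned.
COORD = re.compile(
    r"\[(-?\d+(?:\.\d+)?),\s(-?\d+(?:\.\d+)?),\s(-?\d+(?:\.\d+)?)]"
)

with_subgroups = ORIGINAL.findall(text)
triples = COORD.findall(text)

print(with_subgroups[0])  # ('-34.0', '.0', '23', '', '0.555', '.555')
print(triples)            # [('-34.0', '23', '0.555'), ('3', '4', '5')]
```

In Python, `findall` returns one tuple per match with one entry per capture group, which makes the difference between the two patterns easy to see.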

    Some notation notes:

    • [-]? ---> -?
    • [\.] ---> [.] or \.
    • \d+[\.[\d]+]? ---> I think you meant \d+[.\d]*, where [.\d]* can also match only dots, because the character class optionally repeats any of the listed characters.
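
To make the character-class behaviour concrete, the sketch below tests the numeric part of attempts a and b in isolation, again assuming Python's `re` module. The helper `matches` and the variable names are hypothetical. The reading of attempt b (a class containing a dot, digits, and a literal +, made optional by ?) is not spelled out in the answer above; it follows from the same character-class rules.

```python
import re

# Attempt a: [\.[\d] is ONE character class ('.', '[' or a digit),
# repeated by '+', then followed by an optional literal ']'.
attempt_a = re.compile(r"\d+[\.[\d]+]?")

# Attempt b: [\.\d+] is a character class ('.', a digit or a literal '+');
# the trailing '?' makes that single character optional.
attempt_b = re.compile(r"\d+[\.\d+]?")

# The notation suggested above: digits followed by any run of dots/digits.
suggested = re.compile(r"\d+[.\d]*")


def matches(pattern, s):
    """Return True if the whole string s matches pattern."""
    return pattern.fullmatch(s) is not None


print(matches(attempt_a, "3"))      # False: the class needs at least one more character
print(matches(attempt_a, "1.[2]"))  # True: '[' is an ordinary member of the class
print(matches(attempt_b, "0.555"))  # False: only one character may follow the digits
print(matches(attempt_b, "3+"))     # True: '+' is a literal inside the class
print(matches(suggested, "1..."))   # True: the class can match only dots
```

This is why a grouping construct such as (?:\.\d+)? is needed for "a dot followed by digits": a character class only ever matches a single character from its set.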

    For the notation, see character classes