I tried really hard to make a good title, but I'm not sure if I'm asking this right. Here's my best attempt:
I'm using Python's flavor of regex.
I need to match numbers using named groups:
15x20x30 -> 'values': [15,20,30]
15bits -> 'values': [15]
15 -> 'values': [15]
x15 -> 'values': [15]
But it should not match:
456.48
888,12
6,4.8,4684.,6
My best attempt so far has been:
((?:[\sa-z])(?P<values>\d+)(?:[\sa-z]))
I'm using [\sa-z] instead of a word boundary because 15x20 contains two different values.
But it fails to match both 15 and 20 in the 15x20 case. It does work if I put in an extra space, as in 15x 20. How do I tell it to "reset" the non-capturing group at the end so that it also works for the non-capturing group at the beginning?
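The behaviour described above can be reproduced with a short sketch. It assumes the input is padded with spaces so that the leading and trailing groups have something to match; the helper name find_values is made up for illustration.

```python
import re

# The attempt from the question: both non-capturing groups consume a character.
ATTEMPT = re.compile(r"((?:[\sa-z])(?P<values>\d+)(?:[\sa-z]))")


def find_values(text):
    """Return the 'values' group of every match of ATTEMPT in text."""
    return [m.group("values") for m in ATTEMPT.finditer(text)]


# The 'x' after 15 is consumed by the trailing group, so nothing is left
# for the leading group in front of 20.
print(find_values(" 15x20 "))   # ['15']
# With an extra space, 20 gets its own leading character.
print(find_values(" 15x 20 "))  # ['15', '20']
```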
You may use
(?<![^\sa-z])\d+(?![^\sa-z])
Case insensitive version:
(?i)(?<![^\sa-z])\d+(?![^\sa-z])
Or, compile the pattern with the re.I / re.IGNORECASE flag.
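Since the question asks for a named group, the digits can be wrapped in (?P<values>...) with the lookarounds left outside it. This is a minimal sketch, assuming the re.I flag is wanted; the function name extract_values and the conversion to int are illustrative choices, not part of the pattern above.

```python
import re

# Lookarounds check the neighbours without consuming them;
# only the digits end up in the named group.
PATTERN = re.compile(r"(?<![^\sa-z])(?P<values>\d+)(?![^\sa-z])", re.I)


def extract_values(text):
    """Return every standalone run of digits in text as an int."""
    return [int(m.group("values")) for m in PATTERN.finditer(text)]


samples = ["15x20x30", "15bits", "15", "x15", "456.48", "888,12", "6,4.8,4684.,6"]
for sample in samples:
    print(sample, "->", extract_values(sample))
```

With the samples from the question, the first four lines print their numbers and the last three print an empty list.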
Details
(?<![^\sa-z]) - a negative lookbehind that fails the match if the character immediately to the left is anything other than whitespace or a lowercase letter (any ASCII letter if (?i) or re.I is used); the start of the string is allowed
\d+ - 1+ digits
(?![^\sa-z]) - a negative lookahead that fails the match if the character immediately to the right is anything other than whitespace or a lowercase letter (any ASCII letter if (?i) or re.I is used); the end of the string is allowed
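To see that the lookarounds consume nothing, the match spans can be inspected. In this sketch (the variable names are illustrative), the x at index 2 belongs to neither match, so it can serve as the right-hand context of 15 and the left-hand context of 20.

```python
import re

pattern = re.compile(r"(?<![^\sa-z])\d+(?![^\sa-z])")

# Each span covers only the digits; the separator is never part of a match.
spans = [m.span() for m in pattern.finditer("15x20")]
print(spans)  # [(0, 2), (3, 5)]
```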