I am trying to match strings like 2h30m, 24h, and 1d20h30s and extract every segment with a named group. This sounds easy, and it is if you use PCRE with a regex like the following:
^(((?<h>[0-9]+)h)|((?<m>[0-9]+)m))+$
There is a full example on regexr; you can switch between the PCRE and JavaScript engines in the top right corner.
The thing is, it does not work in JavaScript and I can't figure out why. My guess is that it has something to do with the interaction between the OR operator and the named groups, since in JavaScript the match only returns one of the named groups.
So the question is: why does this happen, and is there any way to make this work in JavaScript?
It seems `|` should be replaced with `?` to capture several groups in a row. In JavaScript, the captures inside a repeated group are reset on every iteration, so with `|` only the group matched in the last iteration is kept. We also need to add a non-empty assertion, `(?=.+)`, to prevent an empty-string match. It is also worth inserting some `?:` to make the grouping-only parentheses non-capturing.
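To see the behaviour described above in isolation, here is a minimal sketch that runs the question's own pattern against one input in JavaScript. The variable names are only illustrative:

```javascript
'use strict';

// Pattern from the question: both named groups sit inside a group repeated by `+`.
const alternation = /^(((?<h>[0-9]+)h)|((?<m>[0-9]+)m))+$/;

const groups = '2h30m'.match(alternation).groups;

// The first pass of the outer `+` captures h = "2"; the second pass matches
// "30m" and resets every capture inside the repeated group, including `h`.
console.log(groups.h); // undefined
console.log(groups.m); // "30"
```

PCRE keeps the captures from earlier iterations, which is why the same pattern reports both groups there. The pattern below gives each unit its own optional group instead of relying on alternation.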
'use strict';
const str = `
2h30m
24h
1d20h30s
`;
const re = /^(?=.+)(?:(?:(?<d>[0-9]+)d)?(?:(?<h>[0-9]+)h)?(?:(?<m>[0-9]+)m)?(?:(?<s>[0-9]+)s)?(?:(?<ms>[0-9]+)ms)?)+$/gm;
let result;
while ((result = re.exec(str)) !== null) console.log(result.groups);
If your Node.js (or browser) supports the newer `matchAll()` method, this can also be achieved this way:
'use strict';
const str = `
2h30m
24h
1d20h30s
`;
const re = /^(?=.+)(?:(?:(?<d>[0-9]+)d)?(?:(?<h>[0-9]+)h)?(?:(?<m>[0-9]+)m)?(?:(?<s>[0-9]+)s)?(?:(?<ms>[0-9]+)ms)?)+$/gm;
console.log(Array.from(str.matchAll(re), ({ groups }) => groups));
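If the segments may appear in any order (for example 30m2h), one alternative is to validate the whole string first and then collect each number and unit pair with a global regex. This is a different technique from the pattern above, shown only as a minimal sketch: `parseDuration` is a hypothetical helper name, the unit list is assumed from the patterns above, and it needs an environment that supports `matchAll()`.

```javascript
'use strict';

// Hypothetical helper: returns an object such as { d: '1', h: '20', s: '30' },
// or null when the input is not a sequence of <digits><unit> segments.
function parseDuration(input) {
  // `ms` is listed before `m` so that "500ms" is not read as "500m" plus a stray "s".
  const whole = /^(?:[0-9]+(?:ms|d|h|m|s))+$/;
  const segment = /(?<value>[0-9]+)(?<unit>ms|d|h|m|s)/g;

  // Reject empty strings and anything with unknown units or stray characters.
  if (!whole.test(input)) return null;

  const result = {};
  for (const { groups } of input.matchAll(segment)) {
    // A repeated unit overwrites the earlier value.
    result[groups.unit] = groups.value;
  }
  return result;
}

console.log(parseDuration('2h30m'));
console.log(parseDuration('1d20h30s'));
console.log(parseDuration('30m2h'));
console.log(parseDuration(''));
```

Because each segment is a separate match, no capture is ever reset by a quantifier, at the cost of running two regexes over the input.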