javascript regex matching regex-group

How can I tokenize an entire string with a regex in JavaScript?


I'm trying to parse time strings and convert them into an object I'm going to call a time module. It is just a simple dictionary object with every time component spelled out.

The thing is that I have to match a string made up of numbers, each followed by a time unit. Currently I am trying to match it with this regex:
/^(([1-9][0-9]*)(y|m|w|d|h|min|s))+$/g

I need it to yield every single match. So if I feed it the string 12y12m12w12d12h12min12s, it should return something like this array:

[
    '12y12m12w12d12h12min12s',    // Matching string
    '12y',
    '12',
    'y',
    '12m',
    '12',
    'm',
    '12w',
    '12',
    'w',
    '12d',
    '12',
    'd',
    '12h',
    '12',
    'h',
    '12min',
    '12',
    'min',
    '12s',
    '12',
    's',
    index: 0,
    input: '12y12m12w12d12h12min12s',
    groups: undefined
]

Instead, it returns only the last unit:

[
    '12y12m12w12d12h12min12s',       
    '12s',
    '12',
    's',
    index: 0,
    input: '12y12m12w12d12h12min12s',
    groups: undefined
]

Can I do this with a regex? If so, how?


Solution

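  • A repeated capture group only keeps its last iteration, which is why you only get 12s back.

    The matchAll method (standardized in ES2020, and easily polyfilled for older environments) gets you quite close if you remove the anchors and the outer repeated group, move min ahead of m in the alternation (without the anchors, m would otherwise match first and 12min would be read as 12m), and flatten the result: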

    const rex = /([1-9][0-9]*)(y|min|m|w|d|h|s)/g;
    const str = "12y12m12w12d12h12min12s";
    const array = [...str.matchAll(rex)].flat();
    console.log(array);
    

    That doesn't give you the overall whole-string match (if you want it, insert it at the start of the array), but it gives you all the rest.
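Dropping the anchors also means the pattern no longer rejects input that contains anything other than number/unit pairs. If that check matters, or the whole-string match is wanted as the first element, one option is to validate with an anchored pattern first and then collect the pairs. This is only a minimal sketch: the helper name tokenize and the choice to return null for invalid input are assumptions, not part of the original answer.

```javascript
// Anchored pattern: checks that the whole string consists of number/unit pairs.
const whole = /^(?:[1-9][0-9]*(?:y|min|m|w|d|h|s))+$/;
// Unanchored global pattern: extracts each pair and its two groups.
const pair = /([1-9][0-9]*)(y|min|m|w|d|h|s)/g;

// Hypothetical helper: returns [wholeMatch, pair, number, unit, ...],
// or null when the input is not made up entirely of number/unit pairs.
function tokenize(str) {
    if (!whole.test(str)) {
        return null;
    }
    return [str, ...[...str.matchAll(pair)].flat()];
}

console.log(tokenize("12h30min"));
console.log(tokenize("12h oops"));
```

Unlike the array returned by exec, the result here is a plain array without the index, input, and groups properties.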

    If you don't want to use matchAll, you'll need a loop:

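    const rex = /([1-9][0-9]*)(y|min|m|w|d|h|s)/g;
    const str = "12y12m12w12d12h12min12s";
    const result = [];
    let match;
    // The g flag is required: it makes exec resume after the previous match.
    while ((match = rex.exec(str)) !== null) {
        result.push(...match);
    }
    console.log(result);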
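The loop works because exec on a regex with the g flag starts searching at rex.lastIndex and updates it after every match; when nothing more matches, exec returns null and resets lastIndex to 0. Without the g flag the same loop would keep returning the first match forever. A small sketch that records those positions, using a shorter input chosen purely for illustration:

```javascript
const rex = /([1-9][0-9]*)(y|min|m|w|d|h|s)/g;
const str = "3h15min";

const positions = [];
let match;
while ((match = rex.exec(str)) !== null) {
    // Record the matched text, where it started, and where the next search resumes.
    positions.push([match[0], match.index, rex.lastIndex]);
}

console.log(positions);
// After the final, failed exec call lastIndex is back at 0.
console.log(rex.lastIndex);
```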
    

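Since the stated goal in the question is a dictionary object rather than a flat array, it may be simpler to fold the matches straight into one. The following is a sketch under assumptions: the helper name parseDuration, the output shape keyed by unit, and the decision to sum repeated units are all illustrative and not taken from the question.

```javascript
const unitPattern = /([1-9][0-9]*)(y|min|m|w|d|h|s)/g;

// Hypothetical helper: "1h30min" -> { h: 1, min: 30 }.
// Repeated units are summed; text that matches no pair is silently ignored.
function parseDuration(str) {
    const result = {};
    for (const [, amount, unit] of str.matchAll(unitPattern)) {
        result[unit] = (result[unit] || 0) + Number(amount);
    }
    return result;
}

console.log(parseDuration("12y12m12w12d12h12min12s"));
```

Because matchAll works on an internal copy of the regex, calling parseDuration repeatedly with the same unitPattern does not carry lastIndex state from one call to the next.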