
JavaScript regular expressions and sub-matches


Why do JavaScript sub-matches stop working when the g modifier is set?

var text = 'test test test test';

var result = text.match(/t(e)(s)t/);
// Result: ["test", "e", "s"]

The above works fine, result[1] is "e" and result[2] is "s".

var result = text.match(/t(e)(s)t/g);
// Result: ["test", "test", "test", "test"]

The above ignores my capturing groups. Is the following the only valid solution?

var result = text.match(/test/g);
for (var i = 0; i < result.length; i++) {
    console.log(result[i].match(/t(e)(s)t/));
}
/* Result:
["test", "e", "s"]
["test", "e", "s"]
["test", "e", "s"]
["test", "e", "s"]
*/

EDIT:

I am back again to happily report that, 10 years later, you can now do this (.matchAll has been added to the spec):

let result = [...text.matchAll(/t(e)(s)t/g)];

Solution

  • .matchAll is now part of the language (ECMAScript 2020) and is supported by current major browsers and by Node.js 12 and later.

    In modern JavaScript we can now accomplish this with the following:

    let result = [...text.matchAll(/t(e)(s)t/g)];
    

    .matchAll spec

    .matchAll docs
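
For reference, each entry that .matchAll yields has the same shape as a non-global .match result: the full match at index 0, then one entry per capture group, plus an index property. The sketch below reads the groups from each match and also shows the older regex.exec loop, which should give the same data in environments without .matchAll. The helper name collectWithExec and the shape of the collected objects are made up for illustration.

```javascript
const text = 'test test test test';

// matchAll returns an iterator; each item is a match array:
// [fullMatch, group1, group2, ...] with an `index` property.
const groupsPerMatch = [];
for (const match of text.matchAll(/t(e)(s)t/g)) {
    groupsPerMatch.push({ index: match.index, e: match[1], s: match[2] });
}
console.log(groupsPerMatch);

// Fallback for older environments (hypothetical helper name).
// With the g flag, exec() resumes from regex.lastIndex on each call
// and returns null once there are no more matches.
// The regex must have the g flag, otherwise this loop never ends.
function collectWithExec(regex, input) {
    const matches = [];
    let match;
    while ((match = regex.exec(input)) !== null) {
        matches.push({ index: match.index, e: match[1], s: match[2] });
    }
    return matches;
}

const viaExec = collectWithExec(/t(e)(s)t/g, text);
console.log(viaExec);
```

Both approaches keep the capture groups because they hand back one full match result per occurrence, whereas .match with the g flag returns only the matched strings.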

    I now maintain an isomorphic JavaScript library, string-saw, that helps with a lot of this type of string parsing. It makes .matchAll easier to use, particularly with named capture groups.

    An example would be:

    saw(text).matchAll(/t(e)(s)t/g)
    

    This outputs a more user-friendly array of matches, and if you want to get fancy you can throw in named capture groups and get an array of objects.
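
For comparison, named capture groups also work with plain .matchAll, without any library: each match carries a groups object keyed by group name. The sketch below is not string-saw's API, only the built-in behaviour, and the group names vowel and consonant are chosen purely for illustration.

```javascript
const input = 'test test test test';

// (?<name>...) defines a named capture group; each match exposes
// the captured values on match.groups.
const named = [...input.matchAll(/t(?<vowel>e)(?<consonant>s)t/g)]
    .map((match) => ({ ...match.groups })); // copy into a plain object

console.log(named);
```

Spreading match.groups into a new object is optional; it is done here only so the result is an ordinary array of plain objects.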