JavaScript regex back-reference not populating all capturing groups


Strange one here (or maybe not). I am attempting to retrieve two capturing groups with a JavaScript regex. The first group is one or more digits (0-9); the second is one or more word characters or hyphens (A-Z, a-z, 0-9, _, -). For some reason I can never retrieve the second group.

Please note: I have purposely included the alternation (|) character, as I may want to receive one group or the other.

This is the code I am using:

var subject = '#/34/test-data';
var myregexp = /#\/(\d+)|\/([\w-]+)/;
var match = myregexp.exec(subject);
if (match != null && match.length > 1) {
  console.log(match[1]); // returns '34' successfully
  console.log(match[2]); // undefined? should return 'test-data'
}

Funny thing is, RegexBuddy tells me I do have two capturing groups and actually highlights them correctly on the test phrase.

Is this a problem in my JavaScript syntax?


Solution

  • If you change:

    var myregexp = /#\/(\d+)|\/([\w-]+)/;
    

    by removing the | alternation meta-character to just:

    var myregexp = /#\/(\d+)\/([\w-]+)/;
    

    it will then match both groups. At present, your regex is looking for either #\/(\d+) or \/([\w-]+), so once the first alternative matches at the start of the string the search stops there and the second group is left undefined. If you remove |, it looks for #/ followed by \d+, then /, then [\w-]+, so it will always capture either both groups or nothing at all.

    Edit: To match all of #/34/test-data, #/test-data and #/34, you can use #(?:\/(\d+))?\/([\w-]+) instead. Note that for #/34 the optional first group is skipped, so the digits end up in the second group.
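
To compare the three patterns side by side, here is a minimal sketch. The `groups` helper and the variable names `alternation`, `sequence` and `optional` are made up for illustration and are not part of the original question or answer; the patterns and test strings are the ones quoted above.

```javascript
// Hypothetical helper (not from the original post): runs the regex once
// and returns capturing groups 1 and 2, or null when there is no match.
function groups(re, subject) {
  var m = re.exec(subject);
  return m === null ? null : [m[1], m[2]];
}

var alternation = /#\/(\d+)|\/([\w-]+)/;      // pattern from the question
var sequence = /#\/(\d+)\/([\w-]+)/;          // alternation removed
var optional = /#(?:\/(\d+))?\/([\w-]+)/;     // pattern from the edit

// Only the first alternative takes part in the match, so group 2 is unset.
console.log(groups(alternation, '#/34/test-data')); // [ '34', undefined ]

// Both groups are required, so it is both or nothing.
console.log(groups(sequence, '#/34/test-data'));    // [ '34', 'test-data' ]
console.log(groups(sequence, '#/34'));              // null

// The numeric segment is optional here.
console.log(groups(optional, '#/34/test-data'));    // [ '34', 'test-data' ]
console.log(groups(optional, '#/test-data'));       // [ undefined, 'test-data' ]
console.log(groups(optional, '#/34'));              // [ undefined, '34' ]
```

A group that does not take part in the match shows up as `undefined` in the result array rather than as an empty string, so a check such as `match[1] !== undefined` is one way to tell which parts were present.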