javascript regex lookbehind

How can I know how many matches get replaced in a string?


Let's say I have a function that looks like this:

function countReplacements ( string, search, replacement ) {
    string.replace ( search, replacement );
}

What's the cleanest and most reliable way of knowing how many matches get replaced in the string?

I've thought of the following possible solutions:

  • Wrap the replacement value with a Proxy that logs every time its proxied value gets accessed. This isn't transpilable down to older versions of JS though.

  • Reimplement the algorithm used in String.prototype.replace so that every time it replaces something it logs this. This isn't very clean at all.

  • If search is a string or a non-global regex, I can check whether string includes/matches it. But if search is a global regex, I'm not sure this will work once JS supports lookbehinds: maybe all matches are computed before any of them is replaced? If they aren't, a replacement may cause a later lookbehind to no longer match, or to match something it wouldn't have matched in the original string.

What do you think is the best solution to the problem?


Solution

  • For the simple case, where replacement is a plain string with no special patterns, pass a callback function as the second argument to .replace instead of the string itself, and have the callback increment a counter:

    function countReplacements(string, search, replacement) {
      let count = 0;
      const result = string.replace(search, () => {
        count++;
        return replacement;
      });
      return { count, result };
    }
    
    console.log(countReplacements('foobar', /o/g, 'a'));
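
    One caveat with this version is worth checking before relying on it: a string returned from a callback is inserted literally, so special patterns such as `$&` or `$1` are not expanded the way they are when the string is passed to .replace directly. A minimal sketch of the difference (the variable names are only for illustration):

```javascript
// Passing the string directly: `$&` expands to the matched text
const direct = 'foobar'.replace(/o/g, '[$&]');

// Returning the same string from a callback: it is inserted as-is
const viaCallback = 'foobar'.replace(/o/g, () => '[$&]');

console.log(direct);      // f[o][o]bar
console.log(viaCallback); // f[$&][$&]bar
```

    So the callback-and-counter approach gives the same result as a direct call only when replacement contains no `$` patterns.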

    For the more complicated cases, where replacement is a function or a string containing group references, you'll have to re-implement that part of String.prototype.replace yourself: use the parameters passed to the callback to get the full match and the groups:

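    function countReplacements(string, search, replacement) {
      let count = 0;
      const result = string.replace(search, (match, ...groups) => {
        count++;
        // A function replacement can simply be called with the same arguments
        if (typeof replacement === 'function') return replacement(match, ...groups);
        // Otherwise expand `$&` and `$1`, `$2`, ... by hand. Note that `groups`
        // holds the capture groups followed by the offset and the whole string
        return replacement
          .replace(/\$(\d+|&)/g, (_, indicator) => {
            if (indicator === '&') return match;
            if (/^\d+$/.test(indicator)) return groups[indicator - 1];
            // and so on for other `$`s
          });
      });
      return { count, result };
    }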
    
    console.log(countReplacements('foobar', /(o)/g, '$1_'));

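    A lazier, easier-to-implement version is to call match and use the length of the result, though this goes through the string with the regex twice (and the length only equals the number of replacements when search is a global regex, since otherwise at most one occurrence is replaced):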

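    function countReplacements(string, search, replacement) {
      // match() treats a string as a regex source, so test strings with includes()
      const match = typeof search === 'string' ? string.includes(search) : string.match(search);
      // Only a global regex replaces more than once; without the g flag the
      // match array also lists capture groups, so its length is not a count
      const count = !match ? 0 : search.global ? match.length : 1;
      const result = string.replace(search, replacement);
      return { count, result };
    }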
    
    console.log(countReplacements('foobar', /(o)/g, '$1_'));
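
    A quick check of what match actually returns may help here, because the array means different things with and without the g flag. The sketch below uses the same 'foobar' input; the variable names are only for illustration:

```javascript
const single = 'foobar'.match(/(o)/);  // no g flag: [full match, group 1]
const all = 'foobar'.match(/(o)/g);    // g flag: one entry per match

console.log(single.length); // 2, although only one replacement would happen
console.log(all.length);    // 2, and two replacements happen

const replacedOnce = 'foobar'.replace(/(o)/, '$1_');
const replacedAll = 'foobar'.replace(/(o)/g, '$1_');

console.log(replacedOnce); // fo_obar
console.log(replacedAll);  // fo_o_bar
```

    In other words, the length of the match result is a reliable count only for a global regex; for a string or a non-global regex the count is simply 0 or 1.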

    "If search is a string or a non-global regex, I can check whether string includes/matches it. But if search is a global regex, I'm not sure this will work once JS supports lookbehinds: maybe all matches are computed before any of them is replaced? If they aren't, a replacement may cause a later lookbehind to no longer match, or to match something it wouldn't have matched in the original string."

    This won't be a problem: the only cost of using .match to get the count in addition to .replace is that it goes through the string twice. All matches are found first, with the lookarounds looking at the original string. Only once every match has been found are the replacements computed and each matched substring swapped for its replacement.
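
    This behaviour can be observed with a small lookbehind example, assuming an engine that supports lookbehind assertions (the input and pattern below are made up for illustration). If each replacement were applied before the next match was searched for, the third "a" would follow a "b" and no longer match:

```javascript
const input = 'aaa';
const pattern = /(?<=a)a/g; // an "a" that is preceded by another "a"

// Count the matches, then do the actual replacement
const count = (input.match(pattern) || []).length;
const result = input.replace(pattern, 'b');

console.log(count);  // 2
console.log(result); // abb (not aba): lookbehinds saw the original string
```

    Both the second and third "a" are replaced, so the count taken from match agrees with what replace actually did.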