javascript regex string-conversion absolute-value

Regex/JavaScript: convert strings with absolute values |x| to abs(x)



I'm trying to convert some math-related strings containing absolute values, using regex in JavaScript. I would like to convert all occurrences of |foo| to abs(foo).

How can I detect if the character is opening or closing, given that they could also be nested? Basically I would like to convert all occurrences of opening | to abs( and all closing | to ). Whatever is between the vertical bars is unchanged.

Some examples of possible input and desired output:
|x|+12
abs(x)+12

|x|+12+|x+2|
abs(x)+12+abs(x+2)

|x|+|x+|z||
abs(x)+abs(x+abs(z))

Any ideas?


Solution

  • Some regex dialects support nesting, but JavaScript's is not one of them. You can, however, do this in steps:

    1. Tag each | with its nesting level (+1 for an opening bar, -1 for a closing bar, going from left to right).
    2. Using the tags, pair each start | with the end | of the same level, working from left to right and from the lowest level to the highest.
    3. Clean up leftover tags in case of unbalanced input.
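
To make step 1 concrete, here is a minimal sketch that performs only the tagging pass, so the intermediate string can be inspected. The helper name `tagBars` is hypothetical and used only for illustration; the `{{sN}}` / `{{eN}}` tag format is the one used by the answer's code.

```javascript
// Hypothetical helper: runs only step 1 (tagging) and returns the tagged string.
function tagBars(str) {
  let level = 0; // current nesting depth
  return str
    .replace(/ /g, '') // remove all spaces
    .replace(/(\|*)(\w+)(\|*)/g, (match, open, operand, close) => {
      let out = '';
      // each leading `|` becomes a start tag, then the level increases
      for (let i = 0; i < open.length; i++) out += '{{s' + (level++) + '}}';
      out += operand;
      // the level decreases first, then each trailing `|` becomes an end tag
      for (let i = 0; i < close.length; i++) out += '{{e' + (--level) + '}}';
      return out;
    });
}

console.log(tagBars('|x|+|y+|z||'));
// prints: {{s0}}x{{e0}}+{{s0}}y+{{s1}}z{{e1}}{{e0}}
```

A start tag and its end tag carry the same number, which is what lets step 2 pair them with a backreference.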

    Working code with test cases up to three levels deep (the code works for any depth):

    function fixAbs(str) {
      const startTag = '{{s%L%}}';
      const endTag   = '{{e%L%}}';
      const absRegex = /\{\{s(\d+)\}\}(.*?)\{\{e\1\}\}/g;
      let level = 0;
      str = str
      .replace(/ /g, '')  // remove all spaces
      .replace(/(\|*)?(\w+)(\|*)?/g, function(m, c1, c2, c3) {
        // regex matches variables with all leading and trailing `|`s
        let s = c2;
        if(c1) {
          // add a start tag to each leading `|`: `{{s0}}`, `{{s1}}`, ...
          // and post-increase level
          s = '';
          for(let i = 0; i < c1.length; i++) {
            s += startTag.replace(/%L%/, level++);
          }
          s += c2;
        }
        if(c3) {
          // decrease level,
          // and add an end tag to each trailing `|`: `{{e2}}`, `{{e1}}`, ...
          for(let i = 0; i < c3.length; i++) {
            s += endTag.replace(/%L%/, --level);
          }
        }
        return s;
      });
      // find matching start and end tag from left to right,
      // repeat for each level
      while(str.match(absRegex)) {
        str = str.replace(absRegex, function(m, c1, c2) {
          return 'abs(' + c2 + ')';
        });
      }
      // turn any leftover tags back into `|` in case of unbalanced input
      str = str.replace(/\{\{[se]-?\d+\}\}/g, '|');
      return str;
    }
    
    const testCases = [
      '|x|+12',
      '|x|+|y+|z||',
      '|x|+||y|+z|',
      '|x|+|x+|y|+z|',
      '|x|+|x+|y+|t||+z|',
      '|x|+12+|2+x|',
      '|x|+12+|x+2|'
    ];
    testCases.forEach(str => {
      const result = fixAbs(str);
      console.log('"' + str + '" ==> "' + result + '"');
    });

    Output:

    "|x|+12" ==> "abs(x)+12"
    "|x|+|y+|z||" ==> "abs(x)+abs(y+abs(z))"
    "|x|+||y|+z|" ==> "abs(x)+abs(abs(y)+z)"
    "|x|+|x+|y|+z|" ==> "abs(x)+abs(x+abs(y)+z)"
    "|x|+|x+|y+|t||+z|" ==> "abs(x)+abs(x+abs(y+abs(t))+z)"
    "|x|+12+|2+x|" ==> "abs(x)+12+abs(2+x)"
    "|x|+12+|x+2|" ==> "abs(x)+12+abs(x+2)"
    

    The code is annotated with comments for clarity.

    This approach is based on a TWiki blog post at https://twiki.org/cgi-bin/view/Blog/BlogEntry201109x3
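
As far as I can tell, the tagging regex above only recognizes a | that sits directly next to a word character (`\w+`), so input such as `|-x|` or `|(x+1)|` comes back unchanged. The following is a rough sketch of a different technique, not the answer's method: a single left-to-right scan that treats a | as closing when it follows something that can end an operand, and as opening otherwise. The function name `absScan` and the choice of which characters end an operand are assumptions made for illustration.

```javascript
// Hypothetical alternative: classify each `|` in one pass, based on what precedes it.
function absScan(input) {
  let out = '';
  let depth = 0;            // number of abs( groups currently open
  let afterOperand = false; // true when the previous character could end an operand

  for (const ch of input.replace(/\s+/g, '')) {
    if (ch === '|') {
      if (afterOperand && depth > 0) {
        out += ')';         // closing bar
        depth--;
        // afterOperand stays true: a finished abs(...) is itself an operand
      } else {
        out += 'abs(';      // opening bar
        depth++;
        afterOperand = false;
      }
    } else {
      out += ch;
      // Assumption: word characters, `.` and `)` end an operand;
      // anything else is treated as an operator or an opening bracket.
      afterOperand = /[\w.)]/.test(ch);
    }
  }
  // Assumption: unbalanced input is returned untouched.
  return depth === 0 ? out : input;
}

console.log(absScan('|x|+|x+|z||')); // abs(x)+abs(x+abs(z))
console.log(absScan('|-x|'));        // abs(-x)
```

Implicit multiplication such as `|a|b|c|` is inherently ambiguous; this sketch resolves it by closing a bar as early as possible, which may not match the intended meaning.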