javascript regex regex-lookarounds lookbehind

Alternatives to Regex Lookbehind in JavaScript


I'm trying to use the following regex in JS:

(?<=@[A-Z|a-z]+,)\s|(?<=@[A-Z|a-z]+,\s[A-Z|a-z]+)\s(?=\[[A-Z|a-z]+\])

which translates to:

match all spaces which are preceded by:

  • @
  • followed by one or more characters in the range A-Z or a-z
  • followed by a comma

OR

match all spaces which are preceded by:

  • @
  • followed by one or more characters in the range A-Z or a-z
  • followed by a comma
  • followed by a space
  • followed by one or more characters in the range A-Z or a-z

AND are followed by:

  • [
  • followed by one or more characters in the range A-Z or a-z
  • ]

However, JS doesn't support lookbehind in every engine. Is there an alternative way to express the above regex in JS, or an npm library I can use instead?

So if we have a sentence like

Hi my name is @John, Doe [Example] and I am happy to be here

that should become

Hi my name is @John,Doe[Example] and I am happy to be here.

Also, if we have something like

Hi my name is @John, Smith Doe [Example]

that should become

Hi my name is @John,SmithDoe[Example].


Solution

  • I've updated my answer to handle the new input

    console.clear();
    
    var inputEl = document.querySelector('#input')
    var outputEl = document.querySelector('#output')
    
    function rep (e) {
      var input = e.target.value;
      // "@", name words, a comma, more name words, then an optional " [Tag]"
      var reg = /@([a-z]+?\s*?)+,(\s+[a-z]+)+(\s\[[a-z]+\])?/gim
    
      var matches = input.match(reg);
      var output = input;
    
      if (matches) {
        // Map each matched text to the same text with all whitespace removed
        var replaceMap = new Map()
        for (var i = 0; i < matches.length; i++) {
          replaceMap.set(matches[i], matches[i].replace(/\s+/g, ''))
        }
        for (var [s, r] of replaceMap) {
          // split/join replaces every literal occurrence, so no regex escaping is needed
          output = output.split(s).join(r)
        }
      }
    
      outputEl.textContent = output
    }
    
    inputEl.addEventListener('input', rep)
    inputEl.dispatchEvent(new Event('input'))
    
    /* CSS */
    textarea {
      width: 100%;
      min-height: 100px;
    }
    
    <!-- HTML -->
    <h3>Input</h3>
    <textarea id="input">@Lopez de la Cerda, Antonio Gabriel Hugo David [Author]. I'm the father of @Marquez, Maria</textarea>
    <h3>Output (initially empty)</h3>
    <p id="output"></p>
    <h3>Expected result (on initial input)</h3>
    <p>@LopezdelaCerda,AntonioGabrielHugoDavid[Author]. I'm the father of @Marquez,Maria</p>
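
To try this outside the browser, the same idea can be sketched as a plain function. This is only an illustration: the name `squeezeMentions` is made up here, and it uses a replacer callback instead of the `Map` from the snippet, which is a swapped-in technique rather than the answer's own code. The pattern itself is copied unchanged.

```javascript
// Hypothetical helper (not part of the original snippet): removes all
// whitespace inside every "@Name, Name [Tag]" run found in the text.
function squeezeMentions(text) {
  // Same pattern as in the snippet above
  var reg = /@([a-z]+?\s*?)+,(\s+[a-z]+)+(\s\[[a-z]+\])?/gim
  // The replacer receives each full match and returns it without whitespace
  return text.replace(reg, function (match) {
    return match.replace(/\s+/g, '')
  })
}

// The two sentences from the question
console.log(squeezeMentions('Hi my name is @John, Doe [Example] and I am happy to be here'))
// Hi my name is @John,Doe[Example] and I am happy to be here
console.log(squeezeMentions('Hi my name is @John, Smith Doe [Example].'))
// Hi my name is @John,SmithDoe[Example].
```

Because the replacer works on each match directly, no second `RegExp` has to be built from the matched text.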

    Backup of old answer content (for historical reasons)

    It works at least in Chrome with this regex:

    /(?<=@[a-z]+,)\s+(?![a-z]+\s+\[[a-z]+\])|(?<=(@[a-z]+,\s[a-z]+))\s+(?=\[[a-z]+\])/gmi
    

    See: https://regex101.com/r/elTkRe/4
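
Since support depends on the engine, one way to guard the lookbehind version is to try compiling a tiny lookbehind pattern first. The sketch below assumes that an engine without lookbehind throws when `new RegExp` is given such a pattern; `supportsLookbehind` and `stripSpaces` are names invented for this illustration.

```javascript
// Hypothetical helper: true if this engine can compile a lookbehind.
function supportsLookbehind() {
  try {
    // Built from a string so an unsupported engine fails here at runtime
    // instead of failing to parse the whole script.
    new RegExp('(?<=a)b')
    return true
  } catch (e) {
    return false
  }
}

// Hypothetical helper: applies the lookbehind regex from above when it is
// available and otherwise returns the input unchanged.
function stripSpaces(text) {
  if (!supportsLookbehind()) return text
  var reg = new RegExp(
    '(?<=@[a-z]+,)\\s+(?![a-z]+\\s+\\[[a-z]+\\])|(?<=(@[a-z]+,\\s[a-z]+))\\s+(?=\\[[a-z]+\\])',
    'gmi'
  )
  return text.replace(reg, '')
}

console.log(stripSpaces('@gotAMatch,     <<<'))
console.log(stripSpaces('@match, match    [match]<<<'))
```

Where the check fails, the capture-group approach is the natural fallback, since it needs no lookaround at all.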

    But you can't use it in PCRE, because PCRE does not allow variable-length quantifiers such as + inside lookbehinds; a lookbehind must be of fixed width. See the errors on the right here: https://regex101.com/r/ZC3XmX/2

    Solution without lookbehinds and lookaheads

    console.clear();
    
    // Inside a character class "|" is a literal pipe, so [A-Za-z] is used throughout
    var reg = /(@[A-Za-z]+,\s[A-Za-z]+)(\s+)(\[[A-Za-z]+\])|(@[A-Za-z]+,)(\s+)/gm
    
    var probes = [
      '@gotAMatch,     <<<',
      '@LongerWithMatch,        <<<',
      '@MatchHereAsWell,    <<<',
      '@Yup,         <<<<',
      '@noMatchInThisLine,<<<<<',
      '@match, match    [match]<<<<<<<',
      '@    noMatchInThisLine,    <<<<'
    ]
    
    // Keep the captured text ($1 and $3, or $4) and drop the whitespace groups
    for (var probe of probes) {
      console.log(probe.replace(reg, '$1$3$4'))
    }
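
The replacement string `'$1$3$4'` relies on the groups of the alternative that did not match expanding to an empty string. A replacer function makes that explicit. The sketch below is just one way to read the code above: `dropSpaces` is a made-up name, and the pattern is the answer's with `[A-Z|a-z]` written as `[A-Za-z]`.

```javascript
// Pattern from the answer; groups 1-3 belong to the first alternative,
// groups 4-5 to the second.
var reg = /(@[A-Za-z]+,\s[A-Za-z]+)(\s+)(\[[A-Za-z]+\])|(@[A-Za-z]+,)(\s+)/gm

// Hypothetical helper: keeps the captured text and drops the whitespace groups.
function dropSpaces(line) {
  return line.replace(reg, function (match, name, gap1, tag, prefix, gap2) {
    // Groups of the alternative that did not match are undefined here
    if (name !== undefined) return name + tag   // "@a, b   [c]" -> "@a, b[c]"
    return prefix                               // "@a,   "      -> "@a,"
  })
}

console.log(dropSpaces('@match, match    [match]<<<<<<<'))
console.log(dropSpaces('@Yup,         <<<<'))
console.log(dropSpaces('@noMatchInThisLine,<<<<<'))
```

Note that the first alternative keeps the single space after the comma inside group 1, so `@match, match    [match]` becomes `@match, match[match]`.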