
Workaround for unsupported lookbehind in older browsers?


Because of limited browser support, I need to replace two RegExp patterns that use lookbehind with something else.

I have these 2 patterns, both of which include a lookbehind:

  1. matches the pattern '(anything not a ')': (ie. 'abcdef':) if it is not at the beginning of the string or preceded by a ,

     /(?<!^|\,)\'[^\']+\'\:/g
    

    and

  2. matches the pattern '(anything not a ')', (ie. 'abcdef',) if it is not at the beginning of the string or preceded by a :

     /(?<!^|\:)\'[^\']+\'\,/g
    

I need to find a pattern for each that matches the same thing, but without the lookbehind.
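To make the intended behaviour concrete, here is a small sketch of what the two lookbehind patterns match on a made-up sample string (the `sample` value is only for illustration, and the snippet assumes an engine that supports lookbehind):

```javascript
// Made-up sample input, only for illustration.
var sample = "'aa'bb':'cc'dd','ee':'ff'";

// Pattern 1: '...': that is neither at the start nor preceded by a comma.
var colonMatches = sample.match(/(?<!^|,)'[^']+':/g);

// Pattern 2: '...', that is neither at the start nor preceded by a colon.
var commaMatches = sample.match(/(?<!^|:)'[^']+',/g);

console.log(colonMatches); // [ "'bb':" ]   ('ee': is skipped because a , precedes it)
console.log(commaMatches); // [ "'dd'," ]
```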

Full code with lookbehind:

I have a user input which I then run through a series of .replace(), with a RegExp in each, to get it to match a certain format.

var str = "(user input)";
// expected format 1-(infinite) of '(something)':'(something)' ,-separated
// ie. "'(something)':'(something)','(something)':'(something)','(something)':'(something)'"

// test str for this example (which is clearly not in the right format)
// str = "  '''aaaaaa¤¤¤ 'iiiiii''''mmmmmm:"bbbbbb''nnnnnn   'kkkkkk¤¤¤,'cccccc'jjjjjj¤¤¤'":'dddddd 'gggggg''hhhhhh',llllll''"'eeeeee¤¤¤  '':'ffffff '  "

// replace all " with ' (so I don't have to account for both in following RegExp below)
str = str.replace(/\"/g, "'");
// str = "  '''aaaaaa¤¤¤ 'iiiiii''''mmmmmm:'bbbbbb''nnnnnn   'kkkkkk¤¤¤,'cccccc'jjjjjj¤¤¤'':'dddddd 'gggggg''hhhhhh',llllll''''eeeeee¤¤¤  '':'ffffff '  "

// remove all illegal characters so even if the format doesn't match nothing bad can be done by the user
str = str.replace(/[^a-zA-Z0-9 \-\/\*\+\=\?\&\%\)\(\#\$\.\,\:\']/g, '');
// str = "  '''aaaaaa 'iiiiii''''mmmmmm:'bbbbbb''nnnnnn   'kkkkkk,'cccccc'jjjjjj'':'dddddd 'gggggg''hhhhhh',llllll''''eeeeee  '':'ffffff '  "

// trim the string for spaces and characters not ' at beginning and end
str = str.replace(/([^\']+(?!\'))$|^[^\']+(?=\')/g, '');
// str = "'''aaaaaa 'iiiiii''''mmmmmm:'bbbbbb''nnnnnn   'kkkkkk,'cccccc'jjjjjj'':'dddddd 'gggggg''hhhhhh',llllll''''eeeeee  '':'ffffff '"

// remove anything that is either multiple ' (ie. ''') or not ' (ie. abc) around all :
str = str.replace(/\'+[^\']*\:[^\']*\'+/g, "':'");
// str = "'''aaaaaa 'iiiiii'''':'bbbbbb''nnnnnn   'kkkkkk,'cccccc'jjjjjj'':'dddddd 'gggggg''hhhhhh',llllll''''eeeeee  '':'ffffff '"

// remove anything that is either multiple ' (ie. ''') or not ' (ie. abc) around all ,
str = str.replace(/\'+[^\']*\,[^\']*\'+/g, "','");
// str = "'''aaaaaa 'iiiiii'''':'bbbbbb''nnnnnn   ','cccccc'jjjjjj'':'dddddd 'gggggg''hhhhhh',''''eeeeee  '':'ffffff '"

// trim inside ''
str = str.replace(/\'\s+|\s+\'/g, "'");
// str = "'''aaaaaa'iiiiii'''':'bbbbbb''nnnnnn','cccccc'jjjjjj'':'dddddd'gggggg''hhhhhh',''''eeeeee'':'ffffff'"

// let all multiple ' (ie. ''') be 1 '
str = str.replace(/\'+/g, "'");
// str = "'aaaaaa'iiiiii':'bbbbbb'nnnnnn','cccccc'jjjjjj':'dddddd'gggggg'hhhhhh','eeeeee':'ffffff'"

// THE FIRST LOOKBEHIND - let all patterns '(anything not a ')': (ie. 'abcdef':) if it is not at the beginning of the string or preceded by a , be ':
while (str.match(/(?<!^|\,)\'[^\']+\'\:/g)) {
    str= str.replace(/(?<!^|\,)\'[^\']+\'\:/g, "':");
}
// loop 1: // str = "'aaaaaa':'bbbbbb'nnnnnn','cccccc':'dddddd'gggggg'hhhhhh','eeeeee':'ffffff'"
// no more matches so moving on

// THE SECOND LOOKBEHIND - let all patterns '(anything not a ')', (ie. 'abcdef',) if it is not at the beginning of the string or preceded by a : be ',
while (str.match(/(?<!^|\:)\'[^\']+\'\,/g)) {
    str= str.replace(/(?<!^|\:)\'[^\']+\'\,/g, "',");
}
// loop 1: // str = "'aaaaaa':'bbbbbb','cccccc':'dddddd'gggggg','eeeeee':'ffffff'"
// loop 2: // str = "'aaaaaa':'bbbbbb','cccccc':'dddddd','eeeeee':'ffffff'"
// no more matches so moving on

// return final string "'aaaaaa':'bbbbbb','cccccc':'dddddd','eeeeee':'ffffff'"
return str;

Now I know that the above series of RegExp can be done more elegantly, but those are not the droids I'm looking for.

The above code was tested in the latest versions of Edge, Firefox, and Chrome and, even though JavaScript throws an error, it still works just fine in those browsers even with the lookbehind.

But as this page states, RegExp lookbehind is supported only by browsers used by 76.49% of internet users, and I don't want just 3/4 of the people visiting my site to be able to use part of it.
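For reference, a minimal sketch of how lookbehind support could be checked at runtime. The function name `supportsLookbehind` is made up for illustration. The pattern is built with the `RegExp` constructor rather than written as a literal, since a regex literal with unsupported syntax is a syntax error for the whole script and could not be caught with `try`/`catch`:

```javascript
// Hypothetical helper: reports whether this engine accepts lookbehind syntax.
function supportsLookbehind() {
    try {
        // Throws a SyntaxError in engines without lookbehind support.
        new RegExp("(?<!a)b");
        return true;
    } catch (e) {
        return false;
    }
}

console.log(supportsLookbehind());
```

This only detects the problem; it does not remove the need for a pattern that works everywhere.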

So I'm looking for a workaround for the lookbehind part of the RegExp above.

I have tried all the solutions listed here:

These all basically boil down to one of the following:

  1. modify the code to use lookahead,
  2. capture the preceding character along with the match and then replace the preceding character with itself, or
  3. do it server-side.

Solution 2 is out, since I can't know what the preceding character is (it can be any of the allowed characters), and so is solution 3, since no server side is involved in this transaction. The lookahead methods suggested all involve changing the matching RegExp, i.e.

\'[^\']+\'\:

so that it matches the new format with the lookahead. But to be quite honest, I have no idea where to even start changing it to use lookahead instead.

These are the droids I'm looking for.

Given the above 2 patterns with lookbehind:

/(?<!^|\,)\'[^\']+\'\:/g

and

/(?<!^|\:)\'[^\']+\'\,/g

What would new patterns that use lookahead, and do the same thing, look like?


Solution

  • The pattern (?<!^|\,)\'[^\']+\'\: asserts that what is directly to the left is neither the start of the string nor a , so there must be a character to the left, and it must be something other than ,

    You can express that with a capture group that matches the character before the part you want to replace, and then use that group in the replacement to keep the character.

    Note that you don't have to escape the ' : and ,

    ([^,])'[^']+':
    

    For example

    str= str.replace(/([^,])'[^']+':/g, "$1':");
    

    You can do the same for the second pattern:

    str= str.replace(/([^:])'[^']+',/g, "$1',");
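Put together with the loops from the question, this could look like the sketch below. The helper names `replaceUntilStable` and `fixSeparators` are made up for illustration. Because `[^,]` and `[^:]` must consume one character, a match at the very start of the string is excluded automatically. The repetition is kept because, as in the question's own walkthrough, some matches only appear after an earlier replacement:

```javascript
// Hypothetical helper: repeats a replacement until the string stops changing.
// Each replacement makes the string shorter, so the loop always terminates.
function replaceUntilStable(str, pattern, replacement) {
    var previous;
    do {
        previous = str;
        str = str.replace(pattern, replacement);
    } while (str !== previous);
    return str;
}

// Hypothetical wrapper for the two lookbehind-free replacements.
function fixSeparators(str) {
    // '...': not at the start and not preceded by a comma
    str = replaceUntilStable(str, /([^,])'[^']+':/g, "$1':");
    // '...', not at the start and not preceded by a colon
    str = replaceUntilStable(str, /([^:])'[^']+',/g, "$1',");
    return str;
}

// Intermediate string from the question, right before the first lookbehind step.
var input = "'aaaaaa'iiiiii':'bbbbbb'nnnnnn','cccccc'jjjjjj':'dddddd'gggggg'hhhhhh','eeeeee':'ffffff'";
console.log(fixSeparators(input));
// 'aaaaaa':'bbbbbb','cccccc':'dddddd','eeeeee':'ffffff'
```

Comparing the string before and after each pass avoids calling `test()` on a regex with the `g` flag, which keeps state in `lastIndex` between calls.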