javascript regex replace quotes lookbehind

JavaScript: do not match a string surrounded by quotes


I'm writing a regular expression in JavaScript that removes whitespace, except when:

  1. Some specific syntax (function, new, return or var) comes directly before the whitespace
  2. The whitespace is inside single or double quotes (escaped quotes inside the quotes don't count as closing quotes)

Now, I've got a big part working. It matches all whitespace that doesn't have the specific syntax in front of it. However, I'm stuck on the quote part.

return str.replace(/(function|new|return|var)?\s/g, function($0, $1) {
    return $1 ? $0 : '';
});
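
A minimal reproduction of the problem (the wrapper name `stripWhitespace` and the input string are made up for illustration): the keyword rule works, but the space inside the quoted string is removed as well.

```javascript
// Hypothetical wrapper around the replace call shown above.
function stripWhitespace(str) {
    return str.replace(/(function|new|return|var)?\s/g, function ($0, $1) {
        // Keep the match when a keyword precedes the whitespace, drop it otherwise.
        return $1 ? $0 : '';
    });
}

var input = 'var s = "a b"; return s;';
var output = stripWhitespace(input);
// output === 'var s="ab";return s;'
// wanted:    'var s="a b";return s;'
```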

I've done quite some testing, but I just can't figure it out. Thanks in advance.


Solution

  • You can use:

    var str = "foo  \"b a \\\" r\" new y 'l o l' foo lol; var x = new 'fo \\' o' ";
    
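    var result = str.replace(/(function|new|return|var)?\s+(?=(?:[^\\"']|\\.)*(?:(?:"(?:[^\\"]|\\.)*"|'(?:[^\\']|\\.)*')(?:[^\\"']|\\.)*)*$)/gm,
    function($0, $1) { return $1 ? $0 : ''; });
    

    See http://jsfiddle.net/qCeC4/

    The lookahead only succeeds when the rest of the line is made up of unquoted text and complete quoted strings, which is never the case for whitespace inside a string. The unquoted run has to sit inside the repeated group, otherwise text between two quoted strings makes the lookahead fail.

    Lookahead part in Perl /x form:

    s/
    \s+
    (?=
        (?:[^\\"']|\\.)*
        (?:
            (?:
                "(?:[^\\"]|\\.)*"
                |
                '(?:[^\\']|\\.)*'
            )
            (?:[^\\"']|\\.)*
        )*
        $
    )
    //xmg;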
    

    Note: As I said before, this is not a good way to parse JS, and it will break on comments, regex literals, and who knows what else.

    Note2: Forgot to add that this only works for "valid" quoting: all quotes must be closed.
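
As a sketch of an alternative, quoted strings and whitespace can be matched in one alternation, with the callback returning quoted strings unchanged. This is a different technique from the lookahead above, not the method used in the answer; the function name `stripOutsideQuotes` is made up for illustration. It has the same limits regarding comments and regex literals, and with an unclosed quote it simply treats the rest of the input as unquoted.

```javascript
// Hypothetical helper: remove whitespace outside quoted strings, keeping
// the whitespace that directly follows function/new/return/var.
function stripOutsideQuotes(str) {
    // Alternatives, tried in order at each position:
    //   1. a double-quoted string (backslash escapes allowed)
    //   2. a single-quoted string (backslash escapes allowed)
    //   3. an optional keyword followed by a run of whitespace
    var pattern = /"(?:[^\\"]|\\.)*"|'(?:[^\\']|\\.)*'|(function|new|return|var)?\s+/g;
    return str.replace(pattern, function (match, keyword) {
        var first = match.charAt(0);
        if (first === '"' || first === "'") {
            return match; // quoted string: keep it untouched
        }
        return keyword ? match : ''; // keep "keyword + whitespace", drop bare whitespace
    });
}

var sample = 'var s = "a b"; return s;';
var stripped = stripOutsideQuotes(sample);
// stripped === 'var s="a b";return s;'
```

Because each quoted string is consumed as a whole match, the scan never starts inside one, so no lookahead to the end of the line is needed.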