
JavaScript - replace a word within a string unless that word is contained within quotes


I am writing some code that parses some SQL expressions, but have run into a problem I cannot solve on my own. I want to replace a certain word in a SQL statement but not if that word is contained within a phrase that is enclosed in single or double quotes. My use case is SQL expression parsing but I think this is more a generic string replacement thing.

To take an example, consider the expression: select {foo} from Table where {foo} = "This is {foo} today" or {foo} = 'this is {foo} tomorrow' or {foo} = "it's all '{foo}' to me!"

Assuming I want to replace {foo} with the string bar, the output would be: select bar from Table where bar = "This is {foo} today" or bar = 'this is {foo} tomorrow' or bar = "it's all '{foo}' to me!"

As we can see all {foo} expressions enclosed within quotes (single or double) have not been replaced.

We can make the assumption that quotes will be closed, i.e. there will be no stray quotes floating around (where {foo} = 'un'even" is not a use case we need to consider.)

Text within nested quotes should not be replaced (however you look at it, the text is contained within quotes :) ). The example illustrates this in the or {foo} = "it's all '{foo}' to me!" part (which also contains three single quotes just for fun).

I have done quite a bit of research on this and it seems a tricky thing to do in JavaScript (or any other language, no doubt). This seems a good fit for regex, but any JavaScript solution, regex or not, would be helpful. The closest I have come to a solution on Stack Overflow is "Don't replace regex if it is enclosed by a character", but it isn't a close enough match to help.


Solution

  • For the example string using JavaScript, you might use a single pattern with a capture group, and check for that group in the replace callback.

    (['"]).*?\1|({foo})
    

    The pattern matches:

    • (['"]) Group 1, capture either " or '
    • .*? Match any character except a line terminator, as few times as possible
    • \1 Match the same char as captured in group 1
    • | Or
    • ({foo}) Capture {foo} in group 2
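To see which alternative fires for each match, here is a minimal sketch that classifies the matches on a shortened input. The names `pattern`, `sample` and `matches` are made up for illustration and are not part of the original answer.

```javascript
// Same pattern as above; the g flag is required by matchAll.
const pattern = /(['"]).*?\1|({foo})/g;

// Shortened, hypothetical input (not the string from the question).
const sample = `select {foo} from T where {foo} = "a {foo} b" or x = 'it is {foo}'`;

// Group 2 is only defined when the {foo} alternative matched,
// i.e. when the token was found outside a quoted string.
const matches = [...sample.matchAll(pattern)].map((m) => ({
  text: m[0],
  kind: m[2] !== undefined ? "token" : "quoted",
}));

console.log(matches);
```

Because a quoted string is consumed as a single match, any {foo} inside it is never seen by the second alternative.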


    In the callback, check for group 2. If it is there, return the replacement; otherwise return the whole match unchanged.

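    // Alternative 1 consumes a whole quoted string; alternative 2 captures {foo} outside quotes.
    const regex = /(['"]).*?\1|({foo})/g;
    const str = `select {foo} from Table where {foo} = "This is {foo} today" or {foo} = 'this is {foo} tomorrow' or {foo} = "it's all '{foo}' to me!"`;
    // Replace only when group 2 matched; otherwise put the quoted string back as is.
    const result = str.replace(regex, (m, _, g2) => g2 ? "bar" : m);
    console.log(result);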
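If the word to replace is not known in advance, the same idea could be wrapped in a small function. This is only a sketch under some assumptions: `escapeRegExp` and `replaceOutsideQuotes` are hypothetical helper names, `[\s\S]*?` is swapped in for `.*?` so that a quoted string may span several lines, and backslash-escaped quotes and SQL comments are not handled.

```javascript
// Hypothetical helper: escape regex metacharacters so the token is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: replace `token` with `replacement` everywhere except inside
// single- or double-quoted strings. Assumes quotes are balanced, as in the question.
function replaceOutsideQuotes(input, token, replacement) {
  if (!token) return input; // an empty token would otherwise match at every position

  // Alternative 1 consumes a quoted string; alternative 2 captures the token.
  const pattern = new RegExp(
    `(['"])[\\s\\S]*?\\1|(${escapeRegExp(token)})`,
    "g"
  );

  // Returning the replacement from a function avoids special handling of "$" sequences.
  return input.replace(pattern, (match, _quote, found) =>
    found !== undefined ? replacement : match
  );
}

const sql = `select {foo} from Table where {foo} = "This is {foo} today" or {foo} = 'this is {foo} tomorrow' or {foo} = "it's all '{foo}' to me!"`;
console.log(replaceOutsideQuotes(sql, "{foo}", "bar"));
// select bar from Table where bar = "This is {foo} today" or bar = 'this is {foo} tomorrow' or bar = "it's all '{foo}' to me!"
```

Building the pattern with `new RegExp` keeps the token configurable; for a fixed token, the regex literal shown earlier is simpler.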