regex shortcode

A regex for cleaning quotes between quotes


I'm trying to write a regex that clears double quotes inside the double quotes of a shortcode attribute.

I wrote this regex:

\="(.*?)\"

It matches the string between the quotes: http://regex101.com/r/jW0uC4

But when an attribute value also contains double quotes, it fails: http://regex101.com/r/pL9bI0
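
The failure can be reproduced with a short Python check. The sample shortcode below is made up for illustration (it is not taken from the linked regex101 pages): because `.*?` is lazy, the match ends at the first closing quote it meets.

```python
import re

# The regex from the question, unchanged.
pattern = re.compile(r'\="(.*?)\"')

# Hypothetical shortcode whose title value contains inner double quotes.
sample = '[gallery title="He said "hello" to me" id="5"]'

# The lazy quantifier stops at the first '"', so the title is truncated.
values = pattern.findall(sample)
print(values)
```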

So, how can I improve the regex so that it captures only the string between =" and the last "?

Thanks in advance


Solution

  • This regex matches the sample text you provided:

    /="(.*?)"(?=\s*(?:[a-z]+=|]))/
    

    Explanation:

      ="                       '="'
      (                        group and capture to \1:
        .*?                      any character except \n (0 or more times
                                 (matching the least amount possible))
      )                        end of \1
      "                        '"'
      (?=                      look ahead to see if there is:
        \s*                      whitespace (\n, \r, \t, \f, and " ") (0
                                 or more times (matching the most amount
                                 possible))
        (?:                      group, but do not capture:
          [a-z]+                   any character of: 'a' to 'z' (1 or
                                   more times (matching the most amount
                                   possible))
          =                        '='
         |                        OR
          ]                        ']'
        )                        end of grouping
      )                        end of look-ahead
    

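    However, user errors are hard to fix after the fact, and this regex will not work in all cases. For example, if the text contains an inner quote followed by a lowercase word and an = character, the look-ahead mistakes it for the start of the next attribute and the match ends too early. You should make sure user input is escaped properly.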
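
As a rough illustration, the regex above could be applied in Python along these lines. The sample shortcodes, the `clean_inner_quotes` name and the choice to delete inner quotes (rather than escape them) are assumptions made for this sketch, not part of the original question. The `]` is written as `\]` only for readability.

```python
import re

# The answer's pattern; the closing bracket is escaped for readability.
ATTR_VALUE = re.compile(r'="(.*?)"(?=\s*(?:[a-z]+=|\]))')

def clean_inner_quotes(shortcode):
    """Drop double quotes found inside each attribute value (hypothetical helper)."""
    def fix(match):
        # Rebuild the attribute value without its inner quotes.
        return '="' + match.group(1).replace('"', '') + '"'
    return ATTR_VALUE.sub(fix, shortcode)

# Hypothetical input with stray quotes inside the title attribute.
sample = '[gallery title="He said "hello" to me" id="5"]'
print(clean_inner_quotes(sample))

# The caveat in practice: an inner quote followed by `word=` looks like the
# start of a new attribute, so the captured value is cut short.
tricky = '[gallery title="say "x" y=1" id="5"]'
print(ATTR_VALUE.findall(tricky))
```

Deleting the inner quotes is only one option; replacing them with an escaped form would keep the original text recoverable.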