php regex preg-replace preg-match preg-quote

How to escape special characters that are not gathered in a special shape


I have this string:

$var = "Foo (.* ) bar blla (.* ) b*l?a$ bla bla ";

I want to escape the * and ? and all other special characters that are not part of this shape:

"(.*)"

I wanted to use preg_quote($var, '/'), but it escapes all the special characters, and I only need the stand-alone special characters to be escaped. I want this result:

$var = "Foo (.* ) bar blla (.* ) b\*l\?a\$ bla bla ";

I want to use the final $var (the result) in a preg_match that matches all (.*) in another string, and the special characters, which in my case are these:

., \, ^, $, |, ?, *, +, (, ), [, ], {, }, and /

should be treated as normal text, so they should be escaped, while the (.*) groups should not be escaped. Only the special characters above need to be escaped, because I will use $var in preg_match. Other characters do not need escaping.

preg_match("/" . $var . "/s", $anotherstring, $match);
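To illustrate the problem described above, here is a minimal sketch of what preg_quote does to the whole string, assuming / as the delimiter (as in the preg_match call). The single-quoted literal is used here only so that PHP does not try to interpolate the $.

```php
<?php
// The string from the question, single-quoted to avoid variable interpolation.
$var = 'Foo (.* ) bar blla (.* ) b*l?a$ bla bla ';

// preg_quote() escapes every regex metacharacter it finds,
// including the ones inside the "(.* )" groups that should stay active.
$quoted = preg_quote($var, '/');

echo $quoted, PHP_EOL;
// Foo \(\.\* \) bar blla \(\.\* \) b\*l\?a\$ bla bla
```

The groups are escaped along with everything else, so they would no longer act as capture groups in preg_match.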

Solution

  • Here are a few patterns that outperform ClasG's answer:

    Input: Foo (.* ) bar blla (.* ) b*l?a$ && bla bla

    Pattern: /\([^)]*\)(*SKIP)(*FAIL)|([^a-z\d ])/i

    Replace with: \\\1

    Output: Foo (.* ) bar blla (.* ) b\*l\?a\$ \&\& bla bla

    Pattern Demo (just 122 steps)

    Basically, it skips the "protected" parenthetical portions and matches any character that is not a letter, digit, or space.
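As a rough sketch, the pattern above could be applied in PHP with preg_replace like this. The variable names are illustrative, and the replacement is written as '\\\\$1' in a single-quoted PHP string, which should be equivalent to the \\\1 replacement shown above (a literal backslash followed by capture group 1).

```php
<?php
// Input from the answer, single-quoted so that $ and & are taken literally.
$input = 'Foo (.* ) bar blla (.* ) b*l?a$ && bla bla';

// First alternative: consume a whole "( ... )" group, then (*SKIP)(*FAIL)
// discards that match and resumes scanning after it.
// Second alternative: capture any character that is not a letter, digit, or space.
$pattern = '/\([^)]*\)(*SKIP)(*FAIL)|([^a-z\d ])/i';

// '\\\\$1' is the PHP literal for the replacement text \\$1,
// i.e. one literal backslash followed by the captured character.
$output = preg_replace($pattern, '\\\\$1', $input);

echo $output, PHP_EOL;
// Foo (.* ) bar blla (.* ) b\*l\?a\$ \&\& bla bla
```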


    If you want to specifically list the symbols, you can just change the negated character class to the character class in the OP like this: (still 122 steps)

    /\([^)]*\)(*SKIP)(*FAIL)|([-\/~`!@#$%^&*()_+={}[\]|;:'"<>,.?\\])/
    

    Or you can use only the symbols in your sample. Here's the full pattern (still 122 steps):

    /\([^)]*\)(*SKIP)(*FAIL)|([*?$&])/
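A sketch of how the escaped result might then be fed into preg_match, as the question intends. The template and subject strings below are simplified, hypothetical examples (not the OP's data), and the variable names are illustrative only.

```php
<?php
// Hypothetical, simplified template: two (.*) groups plus literal symbols.
$template = 'Foo (.*) bar (.*) b*l?a$ bla';

// Escape only * ? $ & that sit outside a parenthesised group.
$escaped = preg_replace('/\([^)]*\)(*SKIP)(*FAIL)|([*?$&])/', '\\\\$1', $template);
// $escaped is now: Foo (.*) bar (.*) b\*l\?a\$ bla

// Hypothetical subject string to match against.
$anotherstring = 'Foo one bar two b*l?a$ bla';

// The (.*) groups still capture; the escaped symbols match literally.
$found = preg_match('/' . $escaped . '/s', $anotherstring, $match);

print_r($match);
// $match[1] is "one", $match[2] is "two"
```

Note that this pattern does not escape the / delimiter, so a template containing / would need that character added to the character class or a different delimiter.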
    

    All of ClasG's patterns are slower than my 3 patterns above:

    ClasG's written pattern: (?<!\(|\(.|\(..)([^\w\s])(?![^(]*\)) fails and takes 418 steps - demo

    ClasG's linked demo pattern: (?<!\(|\(.|\(..)([^\w\s])(?![^(]*\)) is correct but takes 367 steps - demo

    ClasG's third pattern: (?<!\(\.)([*?$&])(?!\)) is correct but has a strict requirement for the parenthetical portion. It is the best pattern in that answer taking 186 steps - demo.