
PHP preg_match double quote matching


I'm a newbie, and a bit of a regex dullard, so please forgive what will appear to be a completely stupid question.

In PHP, I'm trying to use a regex to validate text from a control on an HTML form so that a double quote " is allowed in the input. The current regex isn't working:

preg_match('/[^a-zA-Z0-9 \"\'\?\-]/', $v)

All the other allowed characters work OK, but if I put a " in the text, it still fails the regex.

I've tried [^a-zA-Z0-9 \"\'\?\-] on https://regex101.com/, and it seems to work ok. Is there something wrong with my PHP instance that needs fixing, or is PHP in some way not working consistently with https://regex101.com/?

Ian J.

Edit:

Input: test"

Output: 10

Edit:

$v = test"
$n = 50
$s = Name:
$f = $fail (which is passed by reference as a counter)

function validate_text($v, $n, $s, &$f)
{
    if ($v == "")
    {
        ++$f;
        return "<span class='error'>".$s."</span>";
    }
    elseif ((strlen($v) > $n) || preg_match('/[^a-zA-Z0-9 \"\'\?\-]/', $v))
    {
        ++$f;
        return "<span class='error'>".$s."</span>&nbsp;<span class='errorextra'>(Please enter only upper or lower case letters, numerals, spaces, and basic punctuation, maximum ".$n." characters)</span>";
    }
    return $s;
}

Edit: OK, it appears that something odd is happening between the $_POST value and its being passed to a variable. I will have to investigate and get back. But for now, this question is on hold.

Edit: Some initial investigation points to a call to htmlentities earlier in the code converting the double quote to something else. Therefore I don't think this is a regex problem. I've marked 'beiller' as the answer because his code example put me on the path to finding where the problem actually is.
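The effect described in the last edit can be reproduced in isolation. The following is only a sketch: the pattern is the one from the question, while the htmlentities() call is an assumed stand-in for whatever the earlier code actually does to the $_POST value, and the variable names are hypothetical.

```php
<?php
// Pattern copied from the question: matches any character NOT in the allowed set.
$pattern = '/[^a-zA-Z0-9 \"\'\?\-]/';

$raw     = 'test"';
// Assumption: the real code runs htmlentities() before validation.
// The double quote is converted to the entity &quot;
$encoded = htmlentities($raw);

$rawResult     = preg_match($pattern, $raw);     // no disallowed character
$encodedResult = preg_match($pattern, $encoded); // '&' and ';' are disallowed

echo $encoded, "\n";        // test&quot;
echo $rawResult, "\n";      // 0
echo $encodedResult, "\n";  // 1
```

If this is what is happening, one likely remedy is to validate the raw value first and call htmlentities only when writing the value back into the page.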


Solution

  • Your question is a bit confusing, so let me describe what your regular expression does:

    preg_match('/[^a-zA-Z0-9 \"\'\?\-]/', $v)
    

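    It matches any string that contains at least one character outside the set a-zA-Z0-9, space, ", ', ? and -. In other words, preg_match returns 1 when the input contains a disallowed character and 0 when every character is allowed.

    Also, you are escaping your " with \", which is not necessary inside a single-quoted PHP string. Try removing the backslash.

    The input test" should not be matched by this regex, because every character in it (the letters and the double quote) is in the allowed set.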

    I answered too quickly the first time, so here is another attempt. Try the following code:

    $v = 'test"';
    $n = 50;
    $s = 'Name:';
    $f = 0;
    
    function validate_text($v, $n, $s, &$f)
    {
        if ($v == "")
        {
            ++$f;
            return "<span class='error'>".$s."</span>";
        }
        elseif ((strlen($v) > $n) || preg_match('/[^a-zA-Z0-9 "\'\?\-]/', $v))
        {
            ++$f;
            return "<span class='error'>".$s."</span>&nbsp;<span class='errorextra'>(Please enter only upper or lower case letters, numerals, spaces, and basic punctuation, maximum ".$n." characters)</span>";
        }
        return $s;
    }
    
    echo validate_text($v, $n, $s, $f);
    

    Output:

    Name:
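To make the behaviour of the negated character class concrete, here is a small sketch that runs both spellings of the pattern (with and without the backslash before the double quote) against a few inputs. The sample strings and variable names are made up for illustration and are not from the original code.

```php
<?php
$escaped   = '/[^a-zA-Z0-9 \"\'\?\-]/'; // pattern from the question
$unescaped = '/[^a-zA-Z0-9 "\'\?\-]/';  // pattern from the answer

// Hypothetical sample inputs: the first three use only allowed characters.
$samples = ['test"', "it's ok?", 'a-b', 'test&quot;', 'a<b'];

$results = [];
foreach ($samples as $sample) {
    // preg_match returns 1 if a disallowed character is found, 0 otherwise.
    $results[$sample] = [
        preg_match($escaped, $sample),
        preg_match($unescaped, $sample),
    ];
    printf("%-12s escaped=%d unescaped=%d\n", $sample, ...$results[$sample]);
}
```

Both patterns should give the same result for every input, which suggests that the backslash before the double quote is harmless but redundant. Only the last two samples contain characters outside the allowed set, so only they produce a match.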