Tags: php, performance, preg-match-all, strpos

Which is more efficient: strpos or preg_match?


This follows up on an earlier question: Pattern for check single occurrency into preg_match_all

I understand that my pattern must contain only one word per iteration because, in the case reported in that question, I must find both "microsoft" and "microsoft exchange", and I can't modify my regexp because these two possibilities come dynamically from a database!

So my question is: which is the better solution, 200+ calls to preg_match or the same number of calls to strpos, to check whether a string contains these words?

Here is a sketch of the code for both solutions:

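$array = array(/* 200+ values */);
foreach ($array as $word)
{
    // preg_quote() escapes any regex metacharacters in words loaded from the database
    $pattern = '<\b(?:' . preg_quote($word) . ')\b>i';
    // Only read $matches when something matched, otherwise $matches[0][0] is undefined
    if (preg_match_all($pattern, $text, $matches))
    {
        $fields['skill'][] = $matches[0][0];
    }
}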

The alternative is:

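$array = array(/* 200+ values */);
foreach ($array as $word)
{
    // strpos() takes the haystack first and returns false (not -1) when there is no match
    if (strpos($text, $word) !== false)
    {
        $fields['skill'][] = $word;
    }
}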

Solution

  • strpos is much faster than preg_match. Here is a benchmark:

    $array = array();
    for($i=0; $i<1000; $i++) $array[] = $i;
    $nbloop = 10000;
    $text = <<<EOD
    I understand that my pattern must contain only a word per cycle because, in the case reported in that question, I must find "microsoft" and "microsoft exchange" and I can't modify my regexp because these two possibilities are given dinamically from a database!
    
    So my question is: which is the better solution between over 200 preg_match and the same numbers of str_pos to check if a subset of char contains these words?
    EOD;
    
    $start = microtime(true);
    for ($i=0; $i<$nbloop; $i++) {
        foreach ($array as $word) {
            $pattern='<\b(?:'.$word.')\b>i';
            if (preg_match_all($pattern, $text, $matches)) {
                $fields['skill'][] = $matches[0][0];
            }
        }
    }
    echo "Elapse regex: ", microtime(true)-$start,"\n";
    
    
    $start = microtime(true);
    for ($i=0; $i<$nbloop; $i++) {
        foreach ($array as $word) {
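            // Haystack first, needle second; the cast keeps integer needles from being read as character codes on PHP < 8
            if (strpos($text, (string) $word) !== false) {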
                $fields['skill'][] = $word;
            }
        }
    }
    echo "Elapse strpos: ", microtime(true)-$start,"\n";
    

    Output:

    Elapse regex: 7.9924139976501
    Elapse strpos: 0.62015008926392
    

    In this run, strpos is about 13 times faster. Exact timings will vary with the machine and PHP version.
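
Note that the two calls are not strictly equivalent: the regex is case-insensitive and anchored on word boundaries, while strpos is a case-sensitive substring search. If whole-word matching is still required, one possible compromise is to use stripos as a cheap pre-filter and run the regex only for words that occur somewhere in the text. The following is a minimal sketch under that assumption; `find_skills()` is a hypothetical helper name, not something from the question, and it has not been benchmarked here.

```php
<?php
// Hypothetical helper (illustration only): returns the matched text for each
// word that occurs in $text as a whole word, ignoring case.
function find_skills(string $text, array $words): array
{
    $found = [];
    foreach ($words as $word) {
        $word = (string) $word;
        // Cheap substring test first: most words are rejected without running a regex.
        if ($word === '' || stripos($text, $word) === false) {
            continue;
        }
        // Confirm the hit sits on word boundaries; preg_quote() escapes metacharacters.
        $pattern = '/\b(?:' . preg_quote($word, '/') . ')\b/i';
        if (preg_match($pattern, $text, $m)) {
            $found[] = $m[0];
        }
    }
    return $found;
}

$text = 'Experience with Microsoft Exchange and PHP; some microsoftish tools.';
print_r(find_skills($text, ['microsoft', 'microsoft exchange', 'soft', 'java']));
```

With this sample text the helper returns "Microsoft" and "Microsoft Exchange". "soft" passes the stripos pre-filter but is rejected by the word-boundary check, and "java" is rejected by stripos alone.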