Following up on this question: Pattern for check single occurrency into preg_match_all
I understand that my pattern must contain only one word per iteration because, in the case described in that question, I must find both "microsoft" and "microsoft exchange", and I can't modify my regexp because these two possibilities come dynamically from a database!
So my question is: which is the better solution, over 200 preg_match calls or the same number of strpos calls, to check whether a piece of text contains these words?
I'm trying to write possible code for both solutions:
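$array = array(/* 200+ values */);
foreach ($array as $word)
{
    // Escape the word so characters such as "." or "+" are matched literally
    $pattern = '<\b(?:' . preg_quote($word) . ')\b>i';
    // Only record a skill when the pattern actually matched
    if (preg_match_all($pattern, $text, $matches))
    {
        $fields['skill'][] = $matches[0][0];
    }
}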
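The alternative is:
$array = array(/* 200+ values */);
foreach ($array as $word)
{
    // strpos() takes the haystack first and returns false when there is no match;
    // stripos() is the case-insensitive variant
    if (strpos($text, $word) !== false)
    {
        $fields['skill'][] = $word;
    }
}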
strpos is much faster than preg_match; here is a benchmark:
$array = array();
for($i=0; $i<1000; $i++) $array[] = $i;
$nbloop = 10000;
$text = <<<EOD
I understand that my pattern must contain only a word per cycle because, in the case reported in that question, I must find "microsoft" and "microsoft exchange" and I can't modify my regexp because these two possibilities are given dinamically from a database!
So my question is: which is the better solution between over 200 preg_match and the same numbers of str_pos to check if a subset of char contains these words?
EOD;
$start = microtime(true);
for ($i=0; $i<$nbloop; $i++) {
    foreach ($array as $word) {
        $pattern='<\b(?:'.$word.')\b>i';
        if (preg_match_all($pattern, $text, $matches)) {
            $fields['skill'][] = $matches[0][0];
        }
    }
}
echo "Elapse regex: ", microtime(true)-$start,"\n";
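$start = microtime(true);
for ($i=0; $i<$nbloop; $i++) {
    foreach ($array as $word) {
        // strpos() expects the haystack first and returns false when nothing is found
        if (strpos($text, (string)$word) !== false) {
            $fields['skill'][] = $word;
        }
    }
}
echo "Elapse strpos: ", microtime(true)-$start,"\n";
Output of the original run:
Elapse regex: 7.9924139976501
Elapse strpos: 0.62015008926392
In this run strpos is about 13 times faster; the exact ratio depends on the machine, the PHP version, and the input.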
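One caveat when swapping the regex for strpos: a plain substring search ignores word boundaries, so "java" would also be found inside "javascript". A minimal sketch of one way to keep most of the speed, assuming case-insensitive whole-word matching is what is wanted: use stripos as a cheap pre-filter and run the word-boundary regex only on the candidates that pass. The function name findSkills and the sample data are hypothetical, not taken from the posts above.

```php
<?php
/**
 * Hypothetical helper (not from the original posts): return the entries of
 * $words that occur in $text as whole words, case-insensitively.
 */
function findSkills(array $words, string $text): array
{
    $found = array();
    foreach ($words as $word) {
        // Cheap substring test first: words that are absent are rejected here
        if (stripos($text, $word) === false) {
            continue;
        }
        // Confirm the hit with word boundaries so "java" does not match "JavaScript"
        $pattern = '<\b(?:' . preg_quote($word) . ')\b>i';
        if (preg_match($pattern, $text)) {
            $found[] = $word;
        }
    }
    return $found;
}

// Sample data, made up for illustration
$text = 'Experience with Microsoft Exchange and JavaScript.';
print_r(findSkills(array('microsoft', 'microsoft exchange', 'java', 'php'), $text));
```

With this sample text the result contains "microsoft" and "microsoft exchange" but not "java" or "php". How much of the speed-up survives depends on how many words pass the pre-filter, so it is worth measuring on real data.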