php iconv

Highlight keywords in text - optimization


I'm working on a small function that highlights keywords in text. I previously used a regular expression to replace the original text with the highlighted version, which also worked, but I recently had time to rewrite the function. I'd like some advice on its performance and on how it could be improved. I'd appreciate any ideas.

function highlight($search, $subject, $htmlTag = 'mark')
{
    if (empty($search) === true) {
        return $subject;
    }

    $searchParts = explode(' ', str_replace("'", '', iconv('UTF-8', 'ASCII//TRANSLIT', $search)));
    $subjectParts = explode(' ', str_replace("'", '', iconv('UTF-8', 'ASCII//TRANSLIT', $subject)));
    $originalSubject = explode(' ', $subject);
    $result = [];

    foreach ($subjectParts as $row => $subjectPart) {
        foreach ($searchParts as $searchPart) {
            if (false !== $pos = stripos($subjectPart, $searchPart)) {
                $result[] = mb_substr($originalSubject[$row], 0, $pos) . '<' . $htmlTag . '>' . mb_substr($originalSubject[$row], $pos, mb_strlen($searchPart)) . '</' . $htmlTag . '>' . mb_substr($originalSubject[$row], $pos + mb_strlen($searchPart));

                continue 2;
            }
        }

        $result[] = $originalSubject[$row];
    }

    return implode(' ', $result);
}

Edit: iconv is needed because the function has to match text that contains accented characters.
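As a rough illustration of the transliteration step the function relies on, the sketch below runs iconv on a single word in isolation. The locale name is an assumption (it has to be installed on the system), and the exact output depends on the iconv implementation and locale: some produce plain letters, others insert apostrophes or question marks, which is presumably why the code above strips apostrophes afterwards.

```php
<?php
// Assumption: the en_US.UTF-8 locale is available; with some iconv
// implementations the transliteration result depends on LC_CTYPE.
setlocale(LC_CTYPE, 'en_US.UTF-8');

// Convert an accented UTF-8 word to its closest ASCII representation.
// iconv() returns false if the conversion fails.
$ascii = iconv('UTF-8', 'ASCII//TRANSLIT', 'prijímač');

// Inspect the result; no particular output is guaranteed across platforms.
var_dump($ascii);
```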

Edit 2: Example: highlight('prijimac HD815', 'Satelitný prijímač, Amiko HD8155'); result: "Satelitný <mark>prijímač</mark>, Amiko <mark>HD815</mark>5"


Solution

  • Here's what I would simply do:

    function prepare($pattern)
    {
      // Add any other accented character you wanna handle
      $replacements = [
        'a' => '[aáàäâ]',
        'c' => '[cč]',
        'e' => '[eéèëê]',
        'i' => '[ií]',
        'y' => '[yý]'
      ];
    
      return str_replace(array_keys($replacements), $replacements, $pattern);
    }
    
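    function highlight($search, $subject, $htmlTag = 'mark')
    {
      // An empty search would build an empty pattern that matches at every position
      if (trim($search) === '') {
        return $subject;
      }
    
      // Escape the search (including the "/" delimiter), expand accented variants,
      // and turn whitespace into alternation; "i" ignores case, "u" enables UTF-8 mode
      $pattern = '/' . preg_replace('/\s+/', '|', prepare(preg_quote(trim($search), '/'))) . '/iu';
    
      return preg_replace($pattern, "<$htmlTag>$0</$htmlTag>", $subject);
    }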
    

    Demo: https://3v4l.org/MUX9b
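To check the regex approach against the example from the question, here is a minimal self-contained sketch. The function names accentPattern and highlightWords are hypothetical, chosen only so the snippet can run on its own. It also swaps str_replace for strtr, which replaces every key in a single pass and never rescans text it has already replaced. The character map is the same partial one as above and would need extending for other languages.

```php
<?php
// Hypothetical helper: expand plain letters in one search word into
// character classes that also match their accented variants.
function accentPattern(string $word): string
{
    $classes = [
        'a' => '[aáàäâ]',
        'c' => '[cč]',
        'e' => '[eéèëê]',
        'i' => '[ií]',
        'y' => '[yý]',
    ];

    // Escape regex metacharacters first, then substitute in a single pass.
    return strtr(preg_quote($word, '/'), $classes);
}

// Hypothetical wrapper: wrap every match of any search word in $htmlTag.
function highlightWords(string $search, string $subject, string $htmlTag = 'mark'): string
{
    $words = preg_split('/\s+/', trim($search), -1, PREG_SPLIT_NO_EMPTY);
    if (!$words) {
        return $subject; // nothing to search for
    }

    // "i" = case-insensitive, "u" = treat pattern and subject as UTF-8.
    $pattern = '/' . implode('|', array_map('accentPattern', $words)) . '/iu';

    return preg_replace($pattern, "<$htmlTag>$0</$htmlTag>", $subject);
}

echo highlightWords('prijimac HD815', 'Satelitný prijímač, Amiko HD8155'), "\n";
// prints: Satelitný <mark>prijímač</mark>, Amiko <mark>HD815</mark>5
```

Unlike the iconv-based version, this works on the original string directly, so no character positions have to be mapped back from a transliterated copy.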