How to preg_match a string within a Levenshtein distance in PHP


How can I preg_match a string while tolerating a variable Levenshtein distance from the pattern?

$string = 'i eat apples and oranges all day long';
$find = 'and orangis';
$distance = 1;
$matches = pregMatch_withLevensthein($find, $distance, $string);

This should return 'and oranges'.


Solution

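  • Convert the search string into a regular expression that matches any run of words with the same lengths as the words in $find. Run that expression over the input, then compare each candidate with $find using levenshtein() and keep the ones whose distance is within $distance. Because the word lengths are fixed, candidates whose words differ in length from those in $find are never considered.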

    $string = 'i eat apples and oranges all day long';
    $find = 'and orangis';
    $distance = 1;
    $matches = preg_match_levensthein($find, $distance, $string);
    var_dump($matches);
    
    function preg_match_levensthein($find, $distance, $string)
    {
        $found = array();
    
        // Convert $find into a regex: one fixed-length alphanumeric run per word
        $parts = explode(' ', $find);
        $regexes = array();
        foreach ($parts as $part) {
            $regexes[] = '[a-z0-9]{' . strlen($part) . '}';
        }
        $regexp = '#' . implode('\s', $regexes) . '#i';
    
        // Find all matches
        preg_match_all($regexp, $string, $matches);
    
        // $matches[0] holds every full-pattern match, so loop over that list
        foreach ($matches[0] as $match) {
            // Keep the candidate if its Levenshtein distance is within bounds
            if (levenshtein($match, $find) <= $distance) {
                $found[] = $match;
            }
        }
    
        // return found
        return $found;
    }
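
The function above only finds candidates whose words have exactly the same lengths as the words in $find, so a match that needs an inserted or deleted character is skipped. A minimal sketch of one way to loosen that, assuming ASCII input and words made of letters and digits: widen each word's length range by $distance and use a lookahead so overlapping candidates are also tested. The name preg_match_levenshtein_flexible is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical variant, not from the original answer.
// Assumes ASCII text; strlen() counts bytes, not characters.
function preg_match_levenshtein_flexible(string $find, int $distance, string $string): array
{
    $find = trim($find);
    if ($find === '') {
        return array();
    }

    // Allow each word to be up to $distance characters shorter or longer
    $regexes = array();
    foreach (preg_split('/\s+/', $find) as $part) {
        $min = max(1, strlen($part) - $distance);
        $max = strlen($part) + $distance;
        $regexes[] = '[a-z0-9]{' . $min . ',' . $max . '}';
    }

    // The lookahead consumes nothing, so candidates may overlap;
    // \b keeps every candidate aligned to whole words.
    $regexp = '#(?=\b(' . implode('\s', $regexes) . ')\b)#i';
    preg_match_all($regexp, $string, $matches);

    $found = array();
    // $matches[1] holds the text captured inside the lookahead
    foreach ($matches[1] as $candidate) {
        // Compare case-insensitively, mirroring the regex "i" flag
        if (levenshtein(strtolower($candidate), strtolower($find)) <= $distance) {
            $found[] = $candidate;
        }
    }

    return $found;
}

// 'orangs' is one character shorter than 'oranges'
var_dump(preg_match_levenshtein_flexible('and oranges', 1, 'i eat apples and orangs all day long'));
```

With the question's own input ('and orangis', distance 1) this sketch should still keep 'and oranges'. The trade-off is that wider length ranges produce more candidates, so more levenshtein() calls are made on long inputs.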