I am trying to write a StringMatch function that returns true when the words from one string can be found in another string. The exception is that I don't want matches for things like plurals and other suffixes, and I would also like to avoid matching when a word is prefixed.
To explain more visually:
apple watch - apple watches (no match)
apple watch - apple watch repairs (match)
apple watch - new apple watch (match)
apple watch - pineapple watch (no match)
What I would like is this:
echo StringMatch("apple watch", "apple watches"); // output 0
echo StringMatch("apple watch", "apple watch repairs"); // output 1
echo StringMatch("apple watch", "new apple watch"); // output 1
echo StringMatch("apple watch", "pineapple watch"); // output 0
I have had some basic success with strpos(), but I cannot figure out how to return "0" when the second string contains suffixes or prefixes, as in the examples above.
Here is how I'm trying to solve it:
function StringMatch($str1,$str2)
{
    if (SomeFunctionOrRegex($str1,$str2) !== false)
    {
        return(1);
    }
    else
    {
        return(0);
    }
}
Perhaps there is a graceful regex solution. I have tried strpos() but it is not strict enough for my needs.
Like this, as I said in the comments:
function StringMatch($str1,$str2)
{
    return preg_match('/\b'.preg_quote($str1,'/').'\b/i', $str2);
}
echo StringMatch("apple watch", "apple watches"); // output 0
echo "\n";
echo StringMatch("apple watch", "apple watch repairs"); // output 1
echo "\n";
echo StringMatch("apple watch", "new apple watch"); // output 1
echo "\n";
echo StringMatch("apple watch", "pineapple watch"); // output 0
echo "\n";
Output:
0
1
1
0
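The two \b anchors do the work here: \b only matches at a position between a word character (letter, digit or underscore) and a non-word character or the edge of the string. A minimal sketch of that behaviour follows; hasWholePhrase is a hypothetical name used only for illustration, with the same pattern as StringMatch above.

```php
<?php
// Hypothetical helper for illustration: same pattern as StringMatch above.
function hasWholePhrase(string $phrase, string $text): int
{
    return (int) preg_match('/\b' . preg_quote($phrase, '/') . '\b/i', $text);
}

// "watch" is followed by "e", a word character, so the trailing \b fails.
var_dump(hasWholePhrase('apple watch', 'apple watches'));     // int(0)
// "apple" is preceded by "e", so the leading \b fails.
var_dump(hasWholePhrase('apple watch', 'pineapple watch'));   // int(0)
// Punctuation and string edges count as boundaries; /i ignores case.
var_dump(hasWholePhrase('apple watch', 'Apple Watch, new!')); // int(1)
```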
preg_quote() is necessary to avoid issues where $str1 could contain things like . which in regex matches any character.
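To see the difference, here is a small sketch with a made-up search string ("v1.5" and the sample sentences are assumptions for illustration). The second argument to preg_quote() tells it to also escape the / delimiter.

```php
<?php
$needle = 'v1.5';

// Unquoted: "." matches any character, so "v1x5" is a false positive.
var_dump(preg_match('/\b' . $needle . '\b/i', 'build v1x5 ready')); // int(1)

// Quoted: preg_quote() escapes the dot, so only a literal dot matches.
$quoted = preg_quote($needle, '/');
var_dump($quoted);                                                  // string(5) "v1\.5"
var_dump(preg_match('/\b' . $quoted . '\b/i', 'build v1x5 ready')); // int(0)
var_dump(preg_match('/\b' . $quoted . '\b/i', 'build v1.5 ready')); // int(1)
```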
Furthermore, you could strip punctuation like this:
$str1 = preg_replace('/[^\w\s]+/', '', $str1);
For example:
echo StringMatch("apple watch.", "apple watch repairs"); // output 1
Without removing the punctuation, this will return 0. Whether or not that is important is up to you.
UPDATE
Match out of order, for example:
//words out of order
echo StringMatch("watch apple", "new apple watch"); // output 1
The easy way is implode/explode:
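function StringMatch($str1,$str2)
{
    //use one or the other
    $str1 = preg_replace('/[^\w\s]+/', '', $str1);
    //$str1 = preg_quote($str1,'/');
    //unique, lowercased words; preg_split tolerates repeated whitespace
    $words = array_unique(preg_split('/\s+/', strtolower(trim($str1))));
    preg_match_all('/\b('.implode('|',$words).')\b/i', $str2, $matches);
    //count distinct matched words so repeats in $str2 are not counted twice
    $found = array_unique(array_map('strtolower', $matches[0]));
    return count($words) == count($found) ? '1' : '0';
}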
You can also skip the explode/implode and use
$str1 = preg_replace('/\s+/', '|', $str1);
which can be combined with the other preg_replace:
$str1 = preg_replace(['/[^\w\s]+/','/\s+/'], ['','|'], $str1);
Or all together:
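function StringMatch($str1,$str2)
{
    //strip punctuation, then turn each run of whitespace into a regex "|"
    $str1 = preg_replace(['/[^\w\s]+/','/\s+/'], ['','|'], trim($str1));
    preg_match_all('/\b('.$str1.')\b/i', $str2, $matches);
    //count distinct matched words so repeats in $str2 are not counted twice
    $found = array_unique(array_map('strtolower', $matches[0]));
    return (substr_count($str1, '|')+1) == count($found) ? '1' : '0';
}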
Then, of course, you can't count the words array, but you can count the number of | pipes, which is 1 less than the number of words (hence the +1). That is, if you care that all the words match.
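An alternative to comparing counts is to test each word on its own, which does not depend on how often a word repeats in $str2. This is only a sketch: StringMatchAllWords is a hypothetical name, and treating an empty $str1 as a non-match is an assumption, not something the question specifies.

```php
<?php
// Hypothetical alternative for illustration, not the function above.
function StringMatchAllWords(string $str1, string $str2): int
{
    // Split on any run of whitespace and drop empty pieces.
    $words = preg_split('/\s+/', $str1, -1, PREG_SPLIT_NO_EMPTY);
    if (count($words) === 0) {
        return 0; // assumption: an empty search string matches nothing
    }
    foreach ($words as $word) {
        // Each word must appear as a whole word somewhere in $str2.
        if (preg_match('/\b' . preg_quote($word, '/') . '\b/i', $str2) !== 1) {
            return 0;
        }
    }
    return 1;
}

echo StringMatchAllWords("watch apple", "new apple watch"), "\n";   // 1
echo StringMatchAllWords("watch apple", "apple watch apple"), "\n"; // 1
echo StringMatchAllWords("watch apple", "apple apple"), "\n";       // 0
echo StringMatchAllWords("apple watch", "pineapple watch"), "\n";   // 0
```

It runs one preg_match per word instead of a single preg_match_all, which is usually fine for short search phrases.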