
PHP filter hashtags from string and write result back to file


I'm using fopen() and fwrite() to write some JSON content to a file.

My question: Is there a way to filter the content and write only specific words to that file?

E.g.: I retrieve "I #love #love #love you so #much my dear #brother!" from the JSON file, and I would like to write only the word #love to the file, and only once.

Here is an example of what I get in $message:

<p>👒👿 #follow4follow #followme #follow #smile #happy #instalike #instadaily #instagood #life4like #like #likeback #fashion #fun #like4like #sweettooth #spring #gopro #love #tbt</p>

This is my starting point (it writes the whole content of $message to the file):

$myfile = fopen("custom/hashtag.php", "a");
// PHP_EOL replaces the reversed "\n\r" line ending
fwrite($myfile, "<p>" . $message . "</p>" . PHP_EOL);
fclose($myfile);

/////////////////////////////////////////////
//updated as @insertusernamehere suggested://
/////////////////////////////////////////////

$message = $comment['message']; // taken from the decoded JSON

$whitelist = array('#love');

// get only specific hashtag
preg_match_all('/' . implode('|', $whitelist) . '/', $message, $matches);

$unique_matches = array_unique($matches[0]);

$final = implode(' ', $unique_matches); 

$myfile = fopen("custom/hashtag.php", "a");

// skip the write when nothing matched, to avoid empty paragraphs
if (!empty($unique_matches)) {
    fwrite($myfile, "<p class=\"hidden\">" . $final . "</p>" . PHP_EOL);
}
fclose($myfile);

Solution

  • You can solve it like this:

    $message = 'I #love #love #love you so #much!';
    

    Get all hashtags using a regular expression

    preg_match_all('/#(\\w+)/', $message, $matches);
    

    Get only specific hashtags

    The word boundaries (\b) keep similar tags apart, so #love does not also match the start of #loveYou.

    $whitelist = array('love', 'stackoverflow');
    preg_match_all('/#\b(' . implode('|', $whitelist) . ')\b/', $message, $matches);
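
The whitelist step above can be sketched in a slightly more defensive form. This is only an illustration: the sample message is made up, and it assumes every allowed tag consists of word characters. Each entry is passed through preg_quote() so that regex metacharacters in a tag name are matched literally, and the trailing \b is what rejects #loveYou.

```php
<?php
// Made-up sample input; the whitelist holds bare tag names without '#'.
$message   = 'I #love #loveYou on #stackoverflow';
$whitelist = array('love', 'stackoverflow');

// Escape each entry so regex metacharacters are matched literally.
$escaped = array_map(function ($tag) {
    return preg_quote($tag, '/');
}, $whitelist);

// The trailing \b rejects longer tags that merely start with an allowed one.
$pattern = '/#(?:' . implode('|', $escaped) . ')\b/';
preg_match_all($pattern, $message, $matches);

print implode(' ', $matches[0]);
// prints "#love #stackoverflow"
```

The match is case-sensitive as written; adding the i modifier to the pattern would also accept #Love.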
    

    Throw away duplicates

    $unique_matches = array_unique($matches[0]);
    

    Combine all hashtags, for example separated by a space

    print implode(' ', $unique_matches);
    // prints "#love #much" if all hashtags were collected,
    // or only "#love" if the whitelist pattern was used
    

    Alternatively, collect all hashtags first and filter the list by allowed tags afterwards

    // create a whitelist of hashtags (with the leading #, to match the entries in $unique_matches)
    $whitelist = array('#love', '#stackoverflow');
    // filter the result by this list
    $unique_matches_filtered = array_intersect($whitelist, $unique_matches);
    // prints only "#love"
    print implode(' ', $unique_matches_filtered);
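
To tie the steps back to the original question, here is a minimal sketch of the whole flow: filter, de-duplicate, and append to a file. The function name filter_hashtags() is hypothetical, and the sketch writes to a temporary file rather than custom/hashtag.php so it can be run anywhere. It again assumes tags are made of word characters only.

```php
<?php
/**
 * Hypothetical helper: returns the unique whitelisted hashtags found in
 * $message, in order of first appearance. Whitelist entries are bare tag
 * names without the leading '#'.
 */
function filter_hashtags($message, array $whitelist)
{
    if (empty($whitelist)) {
        return array();
    }
    $escaped = array_map(function ($tag) {
        return preg_quote($tag, '/');
    }, $whitelist);
    preg_match_all('/#(?:' . implode('|', $escaped) . ')\b/', $message, $matches);
    // array_unique() keeps the original keys, so reindex with array_values().
    return array_values(array_unique($matches[0]));
}

$message = '<p>I #love #love #love you so #much my dear #brother!</p>';
$tags    = filter_hashtags($message, array('love'));

// Write only when something matched, so no empty paragraphs are appended.
if (!empty($tags)) {
    $line = '<p class="hidden">' . implode(' ', $tags) . '</p>' . PHP_EOL;
    // Assumption: a temp file stands in for custom/hashtag.php from the question.
    $file = tempnam(sys_get_temp_dir(), 'hashtag');
    file_put_contents($file, $line, FILE_APPEND | LOCK_EX);
}
```

file_put_contents() with FILE_APPEND does the same job as the fopen()/fwrite() pair in append mode, and LOCK_EX guards against two requests writing at the same moment.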