regex, twitter, pattern-matching, preg-match

Regex Valid Twitter Mention


I'm trying to find a regex that tells me whether a tweet is a true mention. To count as a mention, the string can't start with "@", can't contain "RT" (case-insensitive), and the "@" must start a word.

Some examples, with the desired output in the comments:

// Print MATCH / NO MATCH for each sentence tested against $regexp.
function search($strings, $regexp) {
    foreach ($strings as $string) {
        echo "Sentence: \"$string\" <- " .
        (preg_match($regexp, $string) ? "MATCH" : "NO MATCH") . "\n";
    }
}

$strings = array(
"Hi @peter, I like your car ", // <- MATCH
"@peter I don't think so!", // <- NO MATCH: the string starts with @, so it's a reply
"Helo!! :@ how are you!", // <- NO MATCH: the @ is not followed by a word, we need @(word)
"Yes @peter i'll eat them this evening! RT @peter: hey @you, do you want your pancakes?", // <- NO MATCH: the string contains "RT"/"rt", so it's a retweet
"Helo!! [email protected] how are you!", // <- NO MATCH: the @ doesn't start the word
"@peter is the best friend you could imagine. RT @juliet: @you do you know if @peter it's awesome?" // <- NO MATCH: it starts with @ (a reply) and contains RT
);
echo "Example 1:\n";
search($strings,  "/(?:[[:space:]]|^)@/i");

Current output:

Example 1:
Sentence: "Hi @peter, I like your car " <- MATCH
Sentence: "@peter I don't think so!" <- MATCH
Sentence: "Helo!! :@ how are you!" <- NO MATCH
Sentence: "Yes @peter i'll eat them this evening! RT @peter: hey @you, do you want your pancakes?" <- MATCH
Sentence: "Helo!! [email protected] how are you!" <- MATCH
Sentence: "@peter is the best friend you could imagine. RT @juliet: @you do you know if @peter it's awesome?" <- MATCH

EDIT:

I need it as a regex because it may also be used in MySQL and other languages. I'm not looking for the username itself; I only want to know whether the string is a mention or not.


Solution

  • Here's a regex that should work:

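    /^(?!@)(?!.*\bRT\b)(?:.+\s)?@\w+/i
    

    Explanation:

    /^             //Start of the string.
    (?!@)          //Verify that the string does not start with @ (that would be a reply).
    (?!.*\bRT\b)   //Verify that RT does not appear as a word anywhere in the string.
    (?:.+\s)?      //Match any leading characters followed by whitespace, so that @ starts a word.
                      //(?!@) rules out the case where this optional group is skipped.
                      //Note: (?: ) makes the group non-capturing.
    @\w+           //Match @ followed by one or more word characters.
    /i             //Make the match case-insensitive.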
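
A quick way to check a pattern against the rules in the question is to run it over a few sample tweets. The sketch below is only an illustration: the helper name `isMention` and the last sample string are assumptions, not part of the original post. It uses a pattern that rejects a leading `@` with a negative lookahead, rejects `RT` as a whole word, and requires an `@word` preceded by whitespace.

```php
<?php
// Hypothetical helper: returns true when $tweet counts as a mention
// under the rules stated in the question.
function isMention(string $tweet): bool
{
    // (?!@)          reject strings that start with @ (replies)
    // (?!.*\bRT\b)   reject strings containing RT as a whole word (retweets)
    // (?:.+\s)?@\w+  require @ plus word characters, preceded by whitespace
    $pattern = '/^(?!@)(?!.*\bRT\b)(?:.+\s)?@\w+/i';
    return preg_match($pattern, $tweet) === 1;
}

$samples = [
    "Hi @peter, I like your car ",
    "@peter I don't think so!",
    "Helo!! :@ how are you!",
    "Yes @peter i'll eat them this evening! RT @peter: hey @you, do you want your pancakes?",
    "@peter thanks, say hi to @juliet", // illustrative: a reply that also mentions someone
];

foreach ($samples as $tweet) {
    echo "Sentence: \"$tweet\" <- " . (isMention($tweet) ? "MATCH" : "NO MATCH") . "\n";
}
```

Only the first sample is reported as MATCH; the others are rejected as a reply, a bare `@`, a retweet, and a reply with a later mention. Note that `.` does not match newlines by default, so a multi-line tweet would need the `s` modifier for the `RT` check to see every line.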