Tags: r, twitter, stringr, rtweet

Returning Twitter handles per dataframe row


Given the following dataframe:

df <- as.data.frame(c("Testing @cspenn @test @hi","this is a tweet","this is a tweet with @mention of @twitter"))
names(df)[1] <- "content"

I'm trying to extract the individual Twitter handles for each row, instead of all at once.

I have this function, adapted from an example, which returns all of the handles together, but I need them to stay with the row they came from.

df$handles <- plyr::ddply(df, c("content"), function(x){
    mention <- unlist(stringr::str_extract_all(x$content, "@\\w+"))
    # some tweets do not contain mentions, making this necessary:
    if (length(mention) > 0){
        return(data.frame(mention = mention))
    } else {
        return(data.frame(mention = NA))    
    }
})

How can I extract the handles per row, instead of all at once?


Solution

  • library(tidyverse)
    
    df %>%
      mutate(mentions = str_extract_all(content, "@\\w+"))
    

    Output:

                                        content            mentions
    1                 Testing @cspenn @test @hi @cspenn, @test, @hi
    2                           this is a tweet                    
    3 this is a tweet with @mention of @twitter  @mention, @twitter
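
This works because str_extract_all() is vectorised: it returns a list with one character vector per input string, so each row keeps its own handles, and a row with no handles gets an empty vector. The sketch below rebuilds the same list column using only stringr, then flattens it into a plain character column. The mentions_chr column name and the choice of NA for tweets without handles are assumptions made for illustration, not part of the original answer.

```r
library(stringr)

# Same sample data as in the question
df <- data.frame(
  content = c("Testing @cspenn @test @hi",
              "this is a tweet",
              "this is a tweet with @mention of @twitter"),
  stringsAsFactors = FALSE
)

# One list element per row, each holding that row's handles
df$mentions <- str_extract_all(df$content, "@\\w+")

# Hypothetical extra column: collapse each row's handles into one string,
# using NA for tweets that contain no handles
df$mentions_chr <- vapply(
  df$mentions,
  function(m) if (length(m) > 0) paste(m, collapse = ", ") else NA_character_,
  character(1)
)
```

Keeping the list column is handy if the handles will be processed further per row; the collapsed string column is easier to print or write to a CSV file.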