I am curious how to access additional attributes of a graph that are associated with its edges. To follow along, here is a minimal example:
library("igraph")
library("SocialMediaLab")
myapikey =''
myapisecret =''
myaccesstoken = ''
myaccesstokensecret = ''
tweets <- Authenticate("twitter",
apiKey = myapikey,
apiSecret = myapisecret,
accessToken = myaccesstoken,
accessTokenSecret = myaccesstokensecret) %>%
Collect(searchTerm="#trump", numTweets = 100,writeToFile=FALSE,verbose=TRUE)
g_twitter_actor <- tweets %>% Create("Actor", writeToFile=FALSE)
c <- igraph::components(g_twitter_actor, mode = 'weak')
subCluster <- induced.subgraph(g_twitter_actor, V(g_twitter_actor)[which(c$membership == which.max(c$csize))])
The initial tweets data frame contains the following columns:
colnames(tweets)
[1] "text" "favorited" "favoriteCount" "replyToSN" "created_at" "truncated" "replyToSID" "id"
[9] "replyToUID" "statusSource" "screen_name" "retweetCount" "isRetweet" "retweeted" "longitude" "latitude"
[17] "from_user" "reply_to" "users_mentioned" "retweet_from" "hashtags_used"
How can I access the text property for the subgraph in order to perform text analysis?
E(subCluster)$text
does not work
E(subCluster)$text does not work because the values of tweets$text are not added to the graph when it is created, so you have to add them manually. It's a bit of a pain, but doable: it requires some subsetting of the tweets data frame and matching based on user names.
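A quick way to confirm this is to list the attributes the graph actually carries. The sketch below uses a small toy graph rather than the Twitter data (the user names and edge types are made up for illustration) together with igraph's edge_attr_names():

```r
library(igraph)

# Toy stand-in for the actor network: three made-up users, two edges.
# Columns after the first two become edge attributes.
toy <- graph_from_data_frame(
  data.frame(from = c("alice", "bob"),
             to = c("bob", "carol"),
             edgeType = c("Retweet", "Mention"),
             stringsAsFactors = FALSE),
  directed = TRUE
)

edge_attr_names(toy)  # lists the edge attributes that exist on the graph
E(toy)$text           # NULL, because no "text" attribute was ever set
```

Running edge_attr_names(g_twitter_actor) on the real graph should show in the same way which columns made it into the network.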
First, notice that the edge types are in a particular order: retweets, mentions, replies. The same text from a particular user can apply to all three of these. So I think it makes sense to add text serially.
> unique(E(g_twitter_actor)$edgeType)
[1] "Retweet" "Mention" "Reply"
Using dplyr and reshape2 makes this easier.
library(reshape2); library(dplyr)
#Make data frame for retweets, mentions, replies
rts <- tweets %>% filter(!is.na(retweet_from))
ms <- tweets %>% filter(users_mentioned!="character(0)")
rpls <- tweets %>% filter(!is.na(reply_to))
Since users_mentioned can contain a list of individuals, we have to unlist it. But we want to associate the users mentioned with the user who mentioned them.
#Name each element in the users_mentioned list after the user who mentioned them
names(ms$users_mentioned) <- ms$screen_name
#Melting the named list gives one row per (mentioning user, mentioned user) pair;
#the mentioning user ends up in column L1
ms <- melt(ms$users_mentioned)
#Add the text; note that match() returns the first tweet found for each user
ms$text <- tweets$text[match(ms$L1, tweets$screen_name)]
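To see what melt() does with a named list, here is a minimal sketch on a hypothetical list of mentions (the names and values are invented, not taken from the tweets data):

```r
library(reshape2)

# Hypothetical list column: names are the mentioning users,
# elements are the users each of them mentioned
mentions <- list(alice = c("bob", "carol"), dave = "alice")

# One row per mention: 'value' holds the mentioned user,
# 'L1' holds the name of the list element (the mentioning user)
pairs <- melt(mentions)
pairs
```

This is why the code above looks up the tweet text through the L1 column: it identifies who wrote the tweet for each mention row.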
Now add each of these to the network as an edge attribute by matching the edge type.
E(g_twitter_actor)$text[E(g_twitter_actor)$edgeType %in% "Retweet"] <- rts$text
E(g_twitter_actor)$text[E(g_twitter_actor)$edgeType %in% "Mention"] <- ms$text
E(g_twitter_actor)$text[E(g_twitter_actor)$edgeType %in% "Reply"] <- rpls$text
Now you can subset and get the edge value for text.
subCluster <- induced.subgraph(g_twitter_actor,
V(g_twitter_actor)[which(c$membership == which.max(c$csize))])
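Edge attributes are carried over when a subgraph is taken, so once text is set on the full graph it is available on the largest component as well. A minimal sketch on a toy graph (made-up users and texts, assuming the same largest-weak-component logic as above):

```r
library(igraph)

# Toy graph with two weakly connected components; all values are invented
edges <- data.frame(
  from = c("alice", "bob", "dave"),
  to = c("bob", "carol", "erin"),
  edgeType = c("Retweet", "Mention", "Reply"),
  stringsAsFactors = FALSE
)
toy <- graph_from_data_frame(edges, directed = TRUE)

# Attach one text per edge as an edge attribute
E(toy)$text <- c("first text", "second text", "third text")

# Keep only the largest weakly connected component
comp <- components(toy, mode = "weak")
biggest <- induced_subgraph(toy, V(toy)[comp$membership == which.max(comp$csize)])

# The text attribute is still there for the edges that survived
E(biggest)$text
```

On the real data, E(subCluster)$text should then return the tweet texts for the edges in the largest component, ready for text analysis.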