I'm following up on a prior question I asked here: Calculating ratio of reciprocated ties for each node in igraph
The answers were very helpful, but I realized one of the calculations isn't coming out correctly. I'm trying to figure out the ratio of reciprocated edges to outdegree--in other words, what percentage of people I nominate as friends nominate me as a friend?
When students don't nominate friends (outdegree is 0), they're not included in my calculation of reciprocated ties. Since they can't have any reciprocated ties, I want their reciprocity to be calculated as 0. Their ratio of reciprocated ties/outdegree should also be 0.
Here's an example:
library(igraph)
###Creating sample edgelist###
from<- c("A", "A", "A", "B", "B", "B", "C", "D", "D", "E")
to<- c("B", "C", "D", "A", "E", "D", "A", "B", "C", "E")
weight<- c(1,2,3,2,1,3,2,2,1,1)
g2<- as.matrix(cbind(from,to, weight))
###Converting edgelist to network###
g3=graph.edgelist(g2[,1:2])
E(g3)$weight=as.numeric(g2[,3])
###Removing self-loop###
g3<-simplify(g3, remove.loops = T)
Here, E's indegree is 1 and outdegree is 0. I create a self-loop for E so the indegree and outdegree vectors remain the same length, and then remove it.
Next, I see which nominations are reciprocated:
recip<-is.mutual(g3)
recip<-as.data.frame(recip)
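For intuition about what the mutual-edge check returns, here is a minimal base R sketch that does not use igraph. It assumes the edge list is the one from this example with the E self-loop already removed, that vertex names contain no spaces, and recip_manual is a hypothetical name used only for illustration:

```r
# Edge list from the example, in the order simplify() returns it
# (self-loop E -> E already removed).
from <- c("A", "A", "A", "B", "B", "B", "C", "D", "D")
to   <- c("B", "C", "D", "A", "D", "E", "A", "B", "C")

# An edge from -> to is reciprocated when the reverse edge to -> from
# also appears somewhere in the edge list.
forward  <- paste(from, to)   # e.g. "A B"
backward <- paste(to, from)   # e.g. "B A"
recip_manual <- forward %in% backward

recip_manual
```

The result should line up row by row with the recip column shown in the edgelist output further down.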
Then I create an edgelist from g3 and add recip to the data frame:
###Creating edgelist and adding recip###
edgelist<- get.data.frame(g3, what = "edges")
colnames(edgelist)<- c("from", "to", "weight")
edgelist<- cbind(edgelist, recip)
edgelist
> edgelist
from to weight recip
1 A B 1 TRUE
2 A C 2 TRUE
3 A D 3 FALSE
4 B A 2 TRUE
5 B D 3 TRUE
6 B E 1 FALSE
7 C A 2 TRUE
8 D B 2 TRUE
9 D C 1 FALSE
This is where the trouble begins. Since E isn't in from, it's also not in the objects I create below.
Next, I create a table with outdegree and add vertex names:
##Creating outdegree and adding vertex IDs##
outdegree<- as.data.frame(degree(g3, mode="out"))
ID<-V(g3)$name
outdegree<-cbind(ID, outdegree)
colnames(outdegree) <- c("ID","outdegree")
rownames(outdegree)<-NULL
outdegree
outdegree comes out just as I want it:
ID outdegree
1 A 3
2 B 3
3 C 1
4 D 2
5 E 0
When I calculate the number of reciprocated ties for each node, E isn't included, because I aggregate on the from column of edgelist, and E never appears there (as discussed above).
##Calculating number of reciprocated ties##
recip<-aggregate(recip~from,edgelist,sum)
colnames(recip)<- c("ID", "recip")
recip
> recip
ID recip
1 A 2
2 B 2
3 C 1
4 D 1
So that's where the problem is. If I try to create a table with the ratio of reciprocated ties to outdegree, E isn't included:
##Creating ratio table##
ratio<-merge(recip, outdegree, by= "ID")
ratio<-as.data.frame (recip$recip/ratio$outdegree)
ratio<- cbind(recip$ID, ratio)
colnames(ratio)<- c("ID", "ratio")
ratio
ID ratio
1 A 0.6666667
2 B 0.6666667
3 C 1.0000000
4 D 0.5000000
Ultimately, I want a row in ratio for E that equals 0. Since the ratio here would be 0/0 (0 reciprocated ties/0 outdegree), I'd probably get a NaN, but I can convert that to 0 easily, so that would be fine.
I could work around this and export the data to Excel, run the calculations by hand, and keep it easy. But that won't help improve my coding skills, and I have a bunch of networks to run, so it's also pretty inefficient.
Any thoughts on how to automate this?
Thanks again for your help.
E is not showing up because E is not in the from column of edgelist, which is what recip is aggregated on. It only appears in to.
You can aggregate on both columns and then merge.
r1 <- aggregate(recip~from,edgelist,sum)
colnames(r1) <- c("ID", "recip")
r2 <- aggregate(recip~to,edgelist,sum)
colnames(r2) <- c("ID", "recip")
recip <- merge(r1,r2, all = T) # all = T gives the union of the df's
Which gives:
ID recip
1 A 2
2 B 2
3 C 1
4 D 1
5 E 0
Also, with piping:
library(dplyr)
edgelist %>%
  aggregate(recip~from,.,sum) %>%
  rename(ID = from) %>%
  merge(., edgelist %>%
          aggregate(recip~to,.,sum) %>%
          rename(ID = to),
        all = T)
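To get all the way to the ratio table asked for in the question, another option is to merge the reciprocity counts onto the outdegree table and keep every vertex. The following is a sketch only. It hard-codes the edgelist and outdegree tables from the question's output so it runs without igraph, and it assumes that a vertex with outdegree 0 should get a ratio of 0:

```r
# Edge list copied from the question's output (after simplify() and is.mutual()).
edgelist <- data.frame(
  from   = c("A", "A", "A", "B", "B", "B", "C", "D", "D"),
  to     = c("B", "C", "D", "A", "D", "E", "A", "B", "C"),
  weight = c(1, 2, 3, 2, 3, 1, 2, 2, 1),
  recip  = c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Outdegree table copied from the question; E has outdegree 0.
outdegree <- data.frame(
  ID        = c("A", "B", "C", "D", "E"),
  outdegree = c(3, 3, 1, 2, 0),
  stringsAsFactors = FALSE
)

# Reciprocated ties per sender. Only senders appear, so E is missing here.
recip <- aggregate(recip ~ from, edgelist, sum)
colnames(recip) <- c("ID", "recip")

# all.y = TRUE keeps every row of outdegree, so E survives with recip = NA.
ratio <- merge(recip, outdegree, by = "ID", all.y = TRUE)
ratio$recip[is.na(ratio$recip)] <- 0

# 0/0 would be NaN, so set the ratio to 0 wherever outdegree is 0.
ratio$ratio <- ifelse(ratio$outdegree == 0, 0, ratio$recip / ratio$outdegree)

ratio
```

Because the merge is keyed on the outdegree table, which comes from the graph's vertex list, a vertex with no outgoing edges is kept, and so would be an isolated vertex with no edges at all.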