I have an undirected weighted graph for which I want to calculate the closeness measure. According to the igraph
documentation, closeness is the reciprocal of the average shortest-path length. I compute the shortest paths and invert their average, but I still don't get the same values as the closeness
function returns. Why is this happening? What am I missing?
Here's my code:
dput(c$estimate)
structure(c(1, 10000, 10000, 2.69857209553848, 5.77115055524614,
1.95672007809809, 2.98690863617922, 1.92161847347611, 10000,
10000, 10000, 10000, 1, 1.97201563662035, 5.4078452590091, 10000,
6.8534542161595, 3.51453278996925, 10000, 10000, 2.08964950396744,
10000, 10000, 1.97201563662034, 1, 2.78868220464485, 10000, 3.41857460835551,
10000, 1.96044036389546, 10000, 10000, 10000, 2.69857209553835,
5.40784525900909, 2.78868220464486, 1, 10000, 10000, 3.54317409176484,
10000, 2.33889236077342, 10000, 10000, 5.77115055524604, 10000,
10000, 10000, 1, 10000, 10000, 10000, 10000, 10000, 10000, 1.95672007809807,
6.85345421615961, 3.41857460835555, 10000, 10000, 1, 10000, 10000,
2.49075030691086, 10000, 10000, 2.98690863617922, 3.51453278996926,
10000, 3.54317409176474, 10000, 10000, 1, 10000, 10000, 10000,
1.73687483250751, 1.92161847347613, 10000, 1.96044036389548,
10000, 10000, 10000, 10000, 1, 4.24032760636799, 3.11756167665886,
5.07827243244947, 10000, 10000, 10000, 2.33889236077345, 10000,
2.49075030691088, 10000, 4.24032760636804, 1, 10000, 1.69643890905686,
10000, 2.08964950396742, 10000, 10000, 10000, 10000, 10000, 3.11756167665892,
10000, 1, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 1.73687483250752,
5.0782724324492, 1.69643890905687, 10000, 1), .Dim = c(11L, 11L
), .Dimnames = list(c("jpm", "gs", "ms", "bofa", "schwab", "brk",
"wf", "citi", "amex", "spgl", "pnc"), c("jpm", "gs", "ms", "bofa",
"schwab", "brk", "wf", "citi", "amex", "spgl", "pnc")))
g <- graph_from_adjacency_matrix(c$estimate, weighted="wt", mode="undirected", diag=F)
closeness(g,weights= round(E(g)$wt,2))
jpm gs ms bofa schwab brk wf citi
0.02503756 0.01877229 0.02203614 0.02151463 0.01088495 0.02189621 0.02226180 0.02418380
amex spgl pnc
0.01988072 0.01632387 0.01913509
# manual
a <- shortest.paths(g,weights=round(E(g)$wt,2))
1/rowMeans(a)
jpm gs ms bofa schwab brk wf citi amex
0.2799695 0.2143414 0.2435245 0.2457002 0.1205876 0.2408583 0.2448798 0.2660218 0.2276490
spgl pnc
0.1855914 0.2140078
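There are two things you need to be aware of.
First, closeness uses normalized = FALSE by default, which returns the reciprocal of the sum of the shortest-path distances; set normalized = TRUE to get the reciprocal of their average.
Second, the denominator for the average is vcount(g) - 1, not vcount(g), because the zero distance from a vertex to itself is not counted. That is why you shouldn't use rowMeans.
From the code below, you can see that the results of the two methods are close to each other (the minor differences might come from precision, but I am not sure).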
> closeness(g,weights = E(g)$wt,normalized = TRUE)
jpm gs ms bofa schwab brk wf citi
0.2504451 0.1876864 0.2203154 0.2151935 0.1088503 0.2190827 0.2226391 0.2418350
amex spgl pnc
0.1988941 0.1632546 0.1914826
> (vcount(g) - 1) / rowSums(shortest.paths(g, weights = E(g)$wt))
jpm gs ms bofa schwab brk wf citi
0.2545725 0.1947856 0.2213624 0.2234093 0.1096228 0.2190827 0.2226391 0.2418350
amex spgl pnc
0.2070431 0.1687258 0.1946688
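The effect of the denominator can be checked by hand on a tiny example. The sketch below uses only base R and a made-up distance matrix for a three-vertex path graph (a - b - c, unit weights), not the data from the question. The names d, closeness_raw, closeness_norm and naive are illustrative, and the formulas assume the definition described above: the sum of distances to the other vertices, optionally scaled by n - 1.

```r
# Shortest-path distance matrix for a path graph a - b - c with unit weights
# (hypothetical toy data, not the matrix from the question).
d <- matrix(c(0, 1, 2,
              1, 0, 1,
              2, 1, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))
n <- nrow(d)

closeness_raw  <- 1 / rowSums(d)        # reciprocal of the sum of distances
closeness_norm <- (n - 1) / rowSums(d)  # reciprocal of the mean over the other n - 1 vertices
naive          <- 1 / rowMeans(d)       # divides by n, counting the zero self-distance

print(rbind(closeness_raw, closeness_norm, naive))
```

The rowMeans version is always larger than the normalized value by a factor of n / (n - 1): 1.5 for this toy graph, and 1.1 for a graph with 11 vertices.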