I have around 100,000 rows in a data frame and I need to visualize them as a network graph in R. With this much data, the graph is very difficult to analyze visually, and since I am new to R, I am not sure how to approach this.
This is what I am aiming for:
And this is what my df looks like:
Location | Manager |
---|---|
L1 | M1 |
L2 | M3 |
L76 | M1 |
L34 | M1 |
L45 | M1 |
L18 | M4 |
L98 | M7 |
L145 | M4 |
L134 | M1 |
L22 | M5 |
L5 | M7 |
L56 | M7 |
L11 | M8 |
L76 | M5 |
For example, location L22 should be connected to location L76 since they have M5 in common, and so on. I also want the weight of the line connecting these locations to be based on the number of managers they have in common.
Thanks!
I guess you can use the igraph package, as shown below:
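library(igraph)
g <- simplify(
  graph_from_data_frame(
    do.call(
      rbind,
      lapply(
        split(df, ~Manager), # one sub-data-frame per manager
        function(v) {
          with(
            v,
            if (length(Location) > 1) {
              # connect every pair of locations that share this manager;
              # width is set to the number of rows for this manager
              make_full_graph(length(Location)) %>%
                set_vertex_attr(name = "name", value = Location) %>%
                set_edge_attr(name = "width", value = length(Manager)) %>%
                get.data.frame()
            } else {
              # a self-loop keeps a lone location as a vertex; simplify() drops the loop
              data.frame(from = Location, to = Location, width = 1)
            }
          )
        }
      )
    ),
    directed = FALSE
  ),
  edge.attr.comb = "sum" # merge parallel edges by summing their widths
)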
and you will obtain
> g
IGRAPH d49cf33 UN-- 13 15 --
+ attr: name (v/c), width (e/n)
+ edges from d49cf33 (vertex names):
[1] L1 --L76 L1 --L34 L1 --L45 L1 --L134 L76--L34 L76--L45 L76--L22
[8] L76--L134 L34--L45 L34--L134 L45--L134 L18--L145 L98--L5 L98--L56
[15] L5 --L56
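If the edge weight should be exactly the number of managers two locations have in common, as the question asks, one possible alternative is to build a location-by-manager incidence matrix and multiply it by its transpose. This is only a sketch on the sample data from the question; the names `inc`, `shared` and `g2` are introduced here for illustration. It also builds a dense matrix, so for roughly 100,000 rows a sparse representation would probably be needed instead.

```r
library(igraph)

# Sample data copied from the question
df <- data.frame(
  Location = c("L1", "L2", "L76", "L34", "L45", "L18", "L98", "L145",
               "L134", "L22", "L5", "L56", "L11", "L76"),
  Manager  = c("M1", "M3", "M1", "M1", "M1", "M4", "M7", "M4",
               "M1", "M5", "M7", "M7", "M8", "M5")
)

# Location-by-manager incidence matrix: 1 if the pair occurs in df, else 0
inc <- (unclass(table(df$Location, df$Manager)) > 0) * 1

# Entry [i, j] is the number of managers shared by locations i and j
shared <- tcrossprod(inc)
diag(shared) <- 0 # a location is not linked to itself

# Non-zero entries become undirected edges carrying a "weight" attribute
g2 <- graph_from_adjacency_matrix(shared, mode = "undirected", weighted = TRUE)

# plot() reads the edge attribute "width" for line thickness
E(g2)$width <- E(g2)$weight
```

On the sample data this gives the same 13 vertices and 15 edges as the graph above, and every edge has weight 1 because no two locations share more than one manager.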
The network plot is produced by running plot(g).
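With a graph built from about 100,000 rows, a plain plot is likely to be too dense to read. One way to thin it out, sketched below, is to keep only the edges at or above some width and then drop the vertices left without any edge. The edge list and the threshold `min_width` are made up for illustration; they are not taken from the question's data.

```r
library(igraph)

# Hypothetical weighted edge list standing in for the real graph
edges <- data.frame(
  from  = c("L1", "L1", "L76", "L18", "L98"),
  to    = c("L76", "L34", "L22", "L145", "L5"),
  width = c(3, 1, 2, 1, 2)
)
g <- graph_from_data_frame(edges, directed = FALSE)

# Assumed threshold: keep only edges whose width is at least min_width
min_width <- 2
g_small <- delete_edges(g, which(E(g)$width < min_width))

# Remove vertices that no longer have any edge
g_small <- delete_vertices(g_small, which(degree(g_small) == 0))

# Draw with line thickness taken from the width attribute
plot(g_small,
     edge.width = E(g_small)$width,
     vertex.size = 8,
     vertex.label.cex = 0.7)
```

A suitable threshold depends on how the widths are distributed in the real data, so it is worth looking at table(E(g)$width) before choosing one.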