
How to draw a single line for duplicate values in an igraph plot in R?


I have a data frame with three columns that I want to plot with igraph. The first column has duplicate values, and when I visualize it with igraph, it draws two lines for them. I want just one line for duplicate values.

This is the reproducible data:

dput(sample)
structure(list(NMSUKU = c("Aceh/ Achin/ Akhir/ Asji/ A-Tse/ Ureung Aceh", 
"Alas", "Aneuk Jamee", "Gayo", "Gayo Lut", "Gayo Luwes", "Gayo Serbe Jadi", 
"Kluet", "Sigulai", "Simeulue", "Simeulue", "Simeulue", "Singkil", 
"Singkil", "Tamiang"), TopLang = c("Aceh/ Acheh/ Achi ", "Alas ", 
"Aceh Jamee ", "Gajo/ Gayo ", "Gajo/ Gayo ", "Gajo/ Gayo ", "Gajo/ Gayo ", 
"Aceh Kluet ", "ERROR  TopCol out of range ", "Long Bano/ Simalur/ Simeuloe/ Simeulue/ Simulul ", 
"Aceh Simeleu Barat ", "Aceh Simeleu Tengah ", "Aceh Hulu Singkil ", 
"Aceh Hulu Singkil ", "Tamiang "), Ethnicity = c("1_Aceh/ Achin/ Akhir/ Asji/ A-Tse/ Ureung Aceh  ", 
"2_Alas  ", "3_Aneuk Jamee  ", "4_Gayo  ", "6_Gayo Luwes  ", 
"5_Gayo Lut  ", "7_Gayo Serbe Jadi  ", "8_Kluet  ", "NA  ", "10_Simeulue  ", 
"10_Simeulue  ", "10_Simeulue  ", "11_Singkil  ", "17_Batak Pakpak Dairi  ", 
"12_Tamiang  ")), row.names = c(NA, -15L), class = "data.frame")

I ran this code:

library(igraph)

m <- as.matrix(sample)
g <- graph_from_edgelist(rbind(m[,1:2], m[,2:3]), directed = TRUE)
l <- layout_with_sugiyama(g)
plot(g, layout=-l$layout[,2:1],
     edge.arrow.size = 0.1,
     vertex.size = 2.5,
     vertex.color = "grey",
     vertex.label.dist = 1,
     edge.arrow.width = 1.5,
     edge.width = seq(0.5,0.08),
     edge.lty = "solid",
     edge.color = "gray",
     vertex.label.cex = 0.7,
     is.rm = TRUE,
     vertex.label.color = "black")

This is what I got: [image]

I want a single line from Singkil to Aceh Hulu Singkil.


Solution

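  • igraph::simplify() is great for this. By default it merges multiple edges between the same pair of vertices into one (and also removes loop edges).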

    Modifying your plot() call as follows draws only a single line where you previously had two.

    plot(simplify(g), layout=-l$layout[,2:1],
         edge.arrow.size = 0.1,
         vertex.size = 2.5,
         vertex.color = "grey",
         vertex.label.dist = 1,
         edge.arrow.width = 1.5,
         edge.width = seq(0.5,0.08),
         edge.lty = "solid",
         edge.color = "gray",
         vertex.label.cex = 0.7,
         is.rm = TRUE,
         vertex.label.color = "black")
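
    To check what simplify() actually removes, a minimal sketch on a toy graph may help. The vertex names A, B, C and the object names el, g_toy and g_simple are made up for illustration and are not part of the question's data.

    ```r
    library(igraph)

    # Toy edge list (hypothetical data): the edge A -> B appears twice.
    el <- rbind(c("A", "B"),
                c("A", "B"),
                c("B", "C"))
    g_toy <- graph_from_edgelist(el, directed = TRUE)

    any_multiple(g_toy)     # TRUE: at least one edge is duplicated
    which_multiple(g_toy)   # FALSE TRUE FALSE: the repeat is flagged, not the first copy
    ecount(g_toy)           # 3

    # Defaults are remove.multiple = TRUE and remove.loops = TRUE.
    g_simple <- simplify(g_toy)

    any_multiple(g_simple)  # FALSE
    ecount(g_simple)        # 2
    ```

    simplify() keeps all vertices, so a layout computed on the original graph still has one row per vertex of the simplified graph.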
    

    If you add a weight attribute to the edges of the igraph object before simplifying, the number of merged edges is summed into the edge weight, and you can use it in the visualization. The code below draws a slightly thicker line for the edge in question.

    E(g)$weight <- 1
    g <- simplify(g, edge.attr.comb = "sum")
    plot(g, layout=-l$layout[,2:1],
         edge.arrow.size = 0.1,
         vertex.size = 2.5,
         vertex.color = "grey",
         vertex.label.dist = 1,
         edge.arrow.width = 1.5,
         edge.width = E(g)$weight,
         edge.lty = "solid",
         edge.color = "gray",
         vertex.label.cex = 0.7,
         is.rm = TRUE,
         vertex.label.color = "black")
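
    A different route, if the duplicate count is not needed, is to drop repeated rows from the edge list before the graph is built. This is a sketch of that alternative rather than the approach above; the data frame df is a small hypothetical stand-in for the question's sample, and edges, edges_unique and g_unique are illustrative names.

    ```r
    library(igraph)

    # Hypothetical stand-in for the question's three-column data frame.
    df <- data.frame(
      NMSUKU    = c("Singkil", "Singkil", "Tamiang"),
      TopLang   = c("Aceh Hulu Singkil", "Aceh Hulu Singkil", "Tamiang lang"),
      Ethnicity = c("11_Singkil", "17_Batak Pakpak Dairi", "12_Tamiang"),
      stringsAsFactors = FALSE
    )

    m <- as.matrix(df)

    # Stack column 1 -> 2 and column 2 -> 3 into one two-column edge list.
    edges <- rbind(m[, 1:2], m[, 2:3])

    # unique() on a matrix drops repeated rows, i.e. repeated (from, to) pairs.
    edges_unique <- unique(edges)

    g_unique <- graph_from_edgelist(edges_unique, directed = TRUE)

    # Compute the layout on the graph that is actually plotted.
    l_unique <- layout_with_sugiyama(g_unique)
    plot(g_unique, layout = -l_unique$layout[, 2:1],
         edge.arrow.size = 0.1,
         vertex.size = 2.5,
         vertex.label.cex = 0.7)
    ```

    Unlike the weighted simplify() call, this discards the information about how often each edge occurred, so it only fits when a plain single line is all that is wanted.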