R: how to control the behaviour of edges in ggraph

I'm facing this issue. I have some data like this:


edges <- data.frame(a=c('k','k','k','k','k','z','z'),
                    b=c('b','b','b','b','c','b','c'), costant = 1)
  a b costant
1 k b       1
2 k b       1
3 k b       1
4 k b       1
5 k c       1
6 z b       1
7 z c       1

Now I would like to build a graph with ggraph that has nodes and weighted edges. So I worked this way:

library(dplyr)
# first, compute the edge weights
edges1 <- edges %>% group_by(a, b) %>% summarise(weight = sum(costant))
> edges1
# A tibble: 4 x 3
# Groups:   a [?]
  a     b     weight
  <fct> <fct>  <dbl>
1 k     b          4
2 k     c          1
3 z     b          1
4 z     c          1

Then the nodes:

nodes <- rbind(data.frame(word = edges$a, n = 1),data.frame(word = edges$b, n = 1)) %>%
 group_by(word) %>%
summarise(n = sum(n))
> nodes
# A tibble: 4 x 2
  word      n
  <fct> <dbl>
1 k         5
2 z         2
3 b         5
4 c         2

So far, everything works fine. Next, following an example, I built the graph object:

library(tidygraph)
tidy <- tbl_graph(nodes = nodes, edges = edges1, directed = TRUE)
tidy <- tidy %>%
  activate(edges)

Then I plotted the graph:

library(ggraph)
library(ggrepel)  # provides geom_text_repel()
ggraph(tidy, layout = "gem") + 
  geom_node_point(aes(size=n)) +
  geom_edge_link(aes(width = weight), alpha = 0.8) + 
  scale_edge_width(range = c(0.2, 2)) +
  geom_text_repel(aes(x = x, y=y , label=word)) 

But the resulting plot shows a line between k and z, and I cannot figure out why, because that edge does not exist.

Thanks in advance.


  • This seems to happen because tbl_graph converts the factor node columns of the edges1 tibble to integers with as.integer, without consulting the nodes tibble. Each factor is numbered by its own levels, so the resulting indices point at the wrong rows of nodes. If we convert the edge endpoints to the correct integer indices beforehand, it works as expected.

    library(dplyr)
    library(tidygraph)
    library(ggraph)
    library(ggrepel)
    edges <- data.frame(a=c('k','k','k','k','k','z','z'),
                        b=c('b','b','b','b','c','b','c'), costant = 1)
    edges1 <- edges%>% group_by(a,b) %>% summarise(weight = sum(costant))
    nodes <- rbind(data.frame(word = edges$a, n = 1),data.frame(word = edges$b, n = 1)) %>%
      group_by(word) %>%
      summarise(n = sum(n))
    edges2 <- edges1 # keep a copy of the edges with factor node labels in edges2
    # convert 'from' and 'to' factor columns to integer columns correctly 
    # with the nodes tibble's corresponding matched index values 
    edges1$a <- match(edges1$a, nodes$word) 
    edges1$b <- match(edges1$b, nodes$word)
    tidy <- tbl_graph(nodes = nodes, edges = edges1, directed = TRUE)
    tidy <- tidy %>% 
      activate(edges)
    ggraph(tidy, layout = "gem") + 
       geom_node_point(aes(size=n)) +
       geom_edge_link(aes(width = weight), arrow = arrow(length = unit(4, 'mm')), end_cap = circle(3, 'mm'), alpha = 0.8) + 
       scale_edge_width(range = c(0.2, 2)) +
       geom_text_repel(aes(x = x, y=y , label=word)) 
    edges2 # compare the edges in the following tibble with the next figure
    # A tibble: 4 x 3
    # Groups:   a [?]
    #   a     b     weight
    #   <fct> <fct>  <dbl>
    # 1 k     b          4
    # 2 k     c          1
    # 3 z     b          1
    # 4 z     c          1

    [figure: the resulting graph]
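
The mismatch described in this answer can be reproduced in base R, without tidygraph. The following is only a minimal sketch: it assumes, as the answer suggests, that factor endpoints end up as their integer codes, and the names `node_words`, `from` and `to` are illustrative, not part of the original code.

```r
# Nodes in the row order shown in the question's `nodes` tibble.
node_words <- c("k", "z", "b", "c")

# Endpoints of the four aggregated edges, forced to factors so the example
# behaves the same on R >= 4.0, where data.frame() keeps strings as characters.
from <- factor(c("k", "k", "z", "z"))
to   <- factor(c("b", "c", "b", "c"))

# Assumption: factor endpoints are reduced to their integer codes.
# Each factor numbers its own levels, so "k" and "b" both become 1.
naive_from <- as.integer(from)
naive_to   <- as.integer(to)

# Read as row numbers of the node table, the codes name the wrong nodes,
# including an edge between k and z that is not in the data.
naive_edges <- data.frame(from = node_words[naive_from],
                          to   = node_words[naive_to])
print(naive_edges)

# match() looks each label up in the node table, as the answer does.
fixed_from <- match(from, node_words)
fixed_to   <- match(to, node_words)
fixed_edges <- data.frame(from = node_words[fixed_from],
                          to   = node_words[fixed_to])
print(fixed_edges)
```

Under that assumption, the naive codes only ever point at rows 1 and 2 of the node table (k and z), which would account for the unexpected k to z line, while the `match()` version recovers the four intended edges.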