Tags: r, ggplot2, ggraph

Problem using ggraph to plot bigram co-occurrences: some lines don't show up


I am trying to plot the co-occurrences of bigrams as network graphs, following the examples in Text Mining with R.

However, even though I follow exactly the same code as the book, my plots are missing many of the lines and colors. I am not sure whether I have skipped an important step or am missing certain packages.

Below is a simpler version for illustration:

library(dplyr)
library(ggplot2)
library(igraph)
library(ggraph)

terms <- sample(letters[1:10], 50, replace = TRUE)
count <- sample(1:50, 25, replace = TRUE)

# tibble() replaces the deprecated data_frame()
bigrams <- tibble(term1 = terms[1:25], term2 = terms[26:50], occur = count) %>%
  arrange(desc(occur)) %>%
  graph_from_data_frame()

a <- grid::arrow(type = "closed", length = unit(.15, "inches"))

And I'm getting plots that are just not right (even the legend is not shown properly):

ggraph(bigrams, layout = "fr") +
  geom_edge_link(aes(edge_alpha = occur), show.legend = FALSE,  
                 arrow = a, end_cap = circle(.07, 'inches')) +
  geom_node_point(color = "lightblue", size = 5) +
  geom_node_text(aes(label = name), vjust = 1, hjust = 1) +
  theme_void()


ggraph(bigrams, layout = "fr") +
  geom_edge_link(aes(edge_alpha = occur, edge_width = occur), edge_colour = "cyan4") +
  geom_node_point(size = 5) +
  geom_node_text(aes(label = name), repel = TRUE, 
                 point.padding = unit(0.2, "lines")) +
  theme_void()


Update: oddly enough, removing theme_void() fixes the missing lines and colors. I suppose theme_void() behaved differently when the book was written. However, the legend in the second graph still does not show, so something is still wrong.


Solution

  • I've found the ggraph package nice, but it has some issues. For me, your code works if I zoom in on the plot in RStudio.
    However, I suggest a few small changes that make the plot render correctly without zooming:

    ggraph(bigrams, layout = "fr") +
      geom_edge_link(aes(width = occur),          # mapping alpha as well seems to break the legend
                     colour = "cyan4") +
      geom_node_point(size = 5) +
      scale_edge_width(range = c(0.2, 2)) +       # rescale the edge widths
      geom_node_text(aes(label = name), repel = TRUE, point.padding = unit(0.2, "lines")) +
      theme_graph()                               # theme designed for graphs
    


    If you also want the alpha mapping, you can add it back, but in my case the legend then shows only when zooming in on the plot in RStudio.


    The data are the same as yours, but generated after calling set.seed(1).
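
The code for the alpha variant is not shown above, so the following is only a minimal sketch of what it might look like. It assumes the same simulated data with set.seed(1), maps both edge width and edge alpha to occur, and writes the plot to a PNG with explicit dimensions, on the assumption that the "only visible when zooming" behavior comes from the size of the RStudio plot pane. The names p_alpha and out_file, the alpha range, and the file size are illustrative choices, not part of the original answer. theme_graph(base_family = "sans") is used here only to pick a generic font family in case the theme's default font is not installed.

```r
library(dplyr)
library(ggplot2)
library(igraph)
library(ggraph)

# Same simulated data as in the question, made reproducible with a seed
set.seed(1)
terms <- sample(letters[1:10], 50, replace = TRUE)
count <- sample(1:50, 25, replace = TRUE)

bigrams <- tibble(term1 = terms[1:25], term2 = terms[26:50], occur = count) %>%
  arrange(desc(occur)) %>%
  graph_from_data_frame()

# Assumed alpha variant: edge width and edge alpha both follow `occur`
p_alpha <- ggraph(bigrams, layout = "fr") +
  geom_edge_link(aes(edge_width = occur, edge_alpha = occur),
                 edge_colour = "cyan4") +
  geom_node_point(size = 5) +
  scale_edge_width(range = c(0.2, 2)) +   # rescale the edge widths
  scale_edge_alpha(range = c(0.3, 1)) +   # keep faint edges visible (illustrative range)
  geom_node_text(aes(label = name), repel = TRUE,
                 point.padding = unit(0.2, "lines")) +
  theme_graph(base_family = "sans")

# Render at a fixed size so the result does not depend on the plot pane
out_file <- file.path(tempdir(), "bigram_graph.png")  # hypothetical output path
ggsave(out_file, plot = p_alpha, width = 8, height = 6, dpi = 150)
```

If the legend appears in the saved file but not in the plot pane, the pane size is the likely culprit rather than the ggraph code itself.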