
ggraph can't select columns for geom_node_point or geom_node_text aes variables


When I try using variables for geom_node_point or geom_node_text I keep getting errors, no matter what I do. Here's what I was trying at first.

library(igraph)
library(ggraph)
library(tidyverse)
graph3 <- graph_from_data_frame(test, directed = FALSE)
ggraph(graph3, layout = "kk") +
  geom_edge_link(aes(colour = match, alpha = match)) +
  geom_node_point(aes(colour = qcluster, alpha = match)) +
  geom_node_text(repel = TRUE)
# theme(legend.position = "none")

but this gives me the following error

Error in FUN(X[[i]], ...) : object 'qcluster' not found

So then if I try using more precise instructions to show which columns to use, I get other problems

ggraph(graph3, layout = "kk") +
  geom_edge_link(aes(colour = match, alpha = match)) +
  geom_node_point(aes(colour = test$qcluster, alpha = match)) +
  geom_node_text(repel = TRUE)
# theme(legend.position = "none")

returns the error

Don't know how to automatically pick scale for object of type function. Defaulting to continuous.
Error: Aesthetics must be valid data columns. Problematic aesthetic(s): alpha = match.
Did you mistype the name of a data column or forget to add after_stat()?

So my question is: how can I get the columns to work as aes variables? They seem to work just fine in geom_edge_link, so I'm really confused about what exactly is going wrong, especially since manually setting aes values such as strings or numbers works just fine. Below is the test dataset; it's a very small subset of my data to make testing easier.

query subject qcluster scluster match bit_sum
16T1_lib200772_4_13 YP_009910784.1 cr165 cr209 False 57.4
16T1_lib200772_4_17 YP_009910789.1 cr177 cr241 False 131.0
16T1_lib200772_4_17 YP_009910790.1 cr177 cr230 False 57.4
16T1_lib200772_4_23 YP_009910794.1 cr7 cr7 True 69.3
16T1_lib200772_4_82 YP_009910759.1 cr1 cr1 True 92.8
16T1_lib200772_4_83 YP_009910760.1 cr6 cr6 True 79.3

Solution

  • I'm also fairly new to ggraph, but here is what I found: when you create a graph with graph_from_data_frame (or as_tbl_graph from tidygraph), the vertices get only one attribute, name. All other columns become edge attributes, which the geom_node_* layers cannot access. That is why geom_edge_link finds your columns while geom_node_point reports that 'qcluster' is not found. My workaround is to add the vertex attributes manually. The second error appears because the node data has no column called match, so the name resolves to the base R function match(), hence "object of type function". In the example I renamed the column to my_match to avoid that clash. Here is an example that shows how it could work. I was lazy and assigned the vertex attribute values rather arbitrarily, so you need to pick the right value for each of your vertices:

    test = data.frame(
            query = c("16T1_lib200772_4_13", "16T1_lib200772_4_17",
                    "16T1_lib200772_4_17",  "16T1_lib200772_4_23",
                    "16T1_lib200772_4_82",  "16T1_lib200772_4_83"),
            subject = c("YP_009910784.1",   "YP_009910789.1",
                    "YP_009910790.1",   "YP_009910794.1",
                    "YP_009910759.1",   "YP_009910760.1"),
            qcluster = c("cr165",   "cr177",
                    "cr177",    "cr7",
                    "cr1",  "cr6"),
            scluster = c("cr209",   "cr241",
                    "cr230",    "cr7",
                    "cr1",  "cr6"),
            my_match = c(FALSE, FALSE,
                    FALSE,  TRUE,
                    TRUE,   TRUE),
            bitsum = c(57.4,    131.0,
                    57.4,   69.3,
                    92.8,   79.3))
    
    library(igraph)
    library(ggraph)
    library(tidyverse)
    #library(tidygraph)
    graph3 <- graph_from_data_frame(test, directed = FALSE)
    
    #########################################################
    #### quick and dirty addition of a vertex attribute: ####
    graph3 <- set_vertex_attr(graph3, "qcluster", value = c("cr165",
                    "cr177", "cr177", "cr7",
                    "cr1",  "cr6", "cr177", "cr177",
                    "cr7",  "cr1", "cr6"))
    #########################################################
    
    #graph3 <- as_tbl_graph(graph3) 
    
    ggraph(graph3, layout = "kk") + 
      geom_edge_link(aes(colour = my_match)) +
      geom_node_point(aes(colour = qcluster)) +
      geom_node_text(aes(label = name), repel = TRUE)  # + theme(legend.position = "none")
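
Instead of typing the vertex attribute by hand, the values can probably be derived from the data itself. The following is a minimal sketch, not a definitive implementation. It assumes that every query belongs to its qcluster and every subject to its scluster, and that each name maps to exactly one cluster (otherwise graph_from_data_frame stops with a duplicate vertex names error). The names nodes and cluster are made up for this illustration. The node table is passed through the vertices argument of graph_from_data_frame, so the cluster becomes a vertex attribute that geom_node_point can see. The match column keeps its original name here, since the question reports that it already works in the edge layer.

```r
library(igraph)
library(ggraph)

# Same six rows as the sample data in the question.
test <- data.frame(
  query    = c("16T1_lib200772_4_13", "16T1_lib200772_4_17", "16T1_lib200772_4_17",
               "16T1_lib200772_4_23", "16T1_lib200772_4_82", "16T1_lib200772_4_83"),
  subject  = c("YP_009910784.1", "YP_009910789.1", "YP_009910790.1",
               "YP_009910794.1", "YP_009910759.1", "YP_009910760.1"),
  qcluster = c("cr165", "cr177", "cr177", "cr7", "cr1", "cr6"),
  scluster = c("cr209", "cr241", "cr230", "cr7", "cr1", "cr6"),
  match    = c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE),
  bit_sum  = c(57.4, 131.0, 57.4, 69.3, 92.8, 79.3)
)

# One row per vertex: queries carry qcluster, subjects carry scluster.
# `nodes` and `cluster` are hypothetical names chosen for this sketch.
nodes <- unique(rbind(
  data.frame(name = test$query,   cluster = test$qcluster),
  data.frame(name = test$subject, cluster = test$scluster)
))

# The first column of `vertices` is used as the vertex name;
# the remaining columns become vertex attributes.
graph3 <- graph_from_data_frame(test, directed = FALSE, vertices = nodes)

# Check which attributes live on vertices and which on edges.
vertex_attr_names(graph3)
edge_attr_names(graph3)

p <- ggraph(graph3, layout = "kk") +
  geom_edge_link(aes(colour = match)) +            # edge attribute
  geom_node_point(aes(colour = cluster)) +         # vertex attribute
  geom_node_text(aes(label = name), repel = TRUE)  # vertex name as label
print(p)
```

Comparing the output of vertex_attr_names() and edge_attr_names() is also a quick way to see why a column is "not found" in a node layer: if it is only listed as an edge attribute, geom_node_* cannot map it.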