r, ggplot2, rlang, ggrepel

Using `rlang` for conditional labelling in `ggplot2` with `ggrepel`


I am writing a custom function to create a scatterplot with labels attached to the points. Here is a minimal version of it.

# needed libraries
library(tidyverse)
library(ggplot2)
library(ggrepel)

# custom function
label_adder <- function(data, x, y, label.var) {
  # basic plot
  plot <-
    ggplot(data = data,
           mapping = aes(
             x = !!rlang::enquo(x),
             y = !!rlang::enquo(y)
           )) +
    geom_point() +
    geom_smooth(method = "lm")

  # adding label
  plot <-
    plot +
    geom_label_repel(mapping = aes(label = !!rlang::enquo(label.var)))

  return(plot)
}

# creating dataframe
mtcars_new <- mtcars %>%
  tibble::rownames_to_column(var = "car") %>%
  tibble::as_tibble()

# using the function
label_adder(
  data = mtcars_new,
  x = wt,
  y = mpg,
  label.var = car
)

Created on 2018-08-30 by the reprex package (v0.2.0.9000).

Question: I can't figure out how to make the labelling conditional on the values of the x and y variables. For example, suppose the user wants to label not every point in the scatterplot but only the points that satisfy a condition such as:

wt > 5
wt < 4 & mpg < 20
wt > 4 | mpg > 25

etc.

How can I change the geom_label_repel call, using rlang, so that any condition the user provides (involving x and/or y) is evaluated and only the matching points are labelled in the plot?


Solution

  • You could try something like this. I add an expression argument (exp) to your function, check whether the caller supplied it, and if so filter the data passed to the label layer accordingly.

    library(tidyverse)
    library(ggplot2)
    library(ggrepel)
    
    # custom function
    label_adder <- function(data, x, y, label.var, exp = NULL) {
      # arguments actually supplied by the caller
      param_list <- as.list(match.call())
    
      if ("exp" %in% names(param_list)) {
        # a condition was supplied: label only the rows that satisfy it
        plot <-
          ggplot(mapping = aes(
            x = !!rlang::enquo(x),
            y = !!rlang::enquo(y)
          )) +
          geom_point(data = data) +
          geom_smooth(data = data, method = "lm") +
          geom_label_repel(
            data = data %>% filter(!!rlang::enquo(exp)),
            mapping = aes(label = !!rlang::enquo(label.var))
          )
        return(plot)
      } else {
        # no condition: label every point
        plot <-
          ggplot(
            data = data,
            mapping = aes(
              x = !!rlang::enquo(x),
              y = !!rlang::enquo(y)
            )
          ) +
          geom_point() +
          geom_smooth(method = "lm") +
          geom_label_repel(mapping = aes(label = !!rlang::enquo(label.var)))
    
        return(plot)
      }
    }
    
    # creating dataframe
    mtcars_new <- mtcars %>%
      tibble::rownames_to_column(var = "car") %>%
      tibble::as_tibble()
    
    # using the function
    label_adder(
      data = mtcars_new,
      x = wt,
      y = mpg,
      label.var = car
    )
    

    label_adder(
      data = mtcars_new,
      x = wt,
      y = mpg,
      label.var = car,
      exp = wt < 4 & mpg < 20
    )
    

    Created on 2018-08-30 by the reprex package (v0.2.0).
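
    The filtering step can be checked on its own, without drawing a plot. The following is a minimal sketch, not part of the function above: rows_to_label() and the small cars data set are made up for illustration, and it uses rlang::quo_is_null() instead of match.call() to detect that the condition was left at its NULL default.

    ```r
    library(dplyr)
    library(rlang)

    # Hypothetical helper: returns the rows that would receive a label.
    rows_to_label <- function(data, cond = NULL) {
      # capture the condition without evaluating it
      cond_quo <- enquo(cond)

      # TRUE when the caller left `cond` at its NULL default
      if (quo_is_null(cond_quo)) {
        return(data)
      }

      # unquote the captured condition so filter() evaluates it in `data`
      filter(data, !!cond_quo)
    }

    # made-up data, only for illustration
    cars <- tibble::tibble(
      car = c("a", "b", "c", "d"),
      wt  = c(2.5, 3.5, 4.5, 5.5),
      mpg = c(30, 18, 22, 12)
    )

    rows_to_label(cars)$car                     # "a" "b" "c" "d"
    rows_to_label(cars, wt < 4 & mpg < 20)$car  # "b"
    rows_to_label(cars, wt > 4 | mpg > 25)$car  # "a" "c" "d"
    ```

    The same captured condition is what geom_label_repel() receives through its data argument in the function above.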

    Update

    label_adder <- function(data, x, y, label.var, exp = NULL) {
      param_list <- as.list(match.call())
    
      if("exp" %in% names(param_list)){
        my_exp <- rlang::enquo(exp)
      }
      else{
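        # no condition supplied: use one that is TRUE for every row
        a <- "row_number() > 0"
        # parse the string into a call; a symbol would be looked up as a column name
        my_exp <- rlang::quo(!!rlang::parse_expr(a))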
      }
    
      plot <-
        ggplot(
               mapping = aes(
                 x = !!rlang::enquo(x),
                 y = !!rlang::enquo(y)
               )) +
        geom_point(data = data) +
        geom_smooth(data = data, method = "lm")+
        geom_label_repel(data = data %>% filter(!!my_exp), 
                         mapping = aes(label = !!rlang::enquo(label.var)))
      return(plot)
    
    }
    

    This still uses an if and else, but only to choose the filter condition, so the plotting code is no longer duplicated as in the first version.
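
    The branch can be dropped entirely by giving the condition a default that keeps every row. This is a sketch under assumptions rather than a definitive version: label_adder2 is a hypothetical name, the {{ }} operator requires rlang 0.4.0 or later, and it relies on dplyr::filter() recycling a single TRUE to all rows.

    ```r
    library(ggplot2)
    library(ggrepel)
    library(dplyr)

    # Hypothetical variant of label_adder(); `exp = TRUE` means "label everything".
    label_adder2 <- function(data, x, y, label.var, exp = TRUE) {
      ggplot(data, aes(x = {{ x }}, y = {{ y }})) +
        geom_point() +
        geom_smooth(method = "lm") +
        # only rows satisfying `exp` reach the label layer;
        # with the default TRUE, filter() keeps every row
        geom_label_repel(
          data = filter(data, {{ exp }}),
          mapping = aes(label = {{ label.var }})
        )
    }

    mtcars_new <- tibble::rownames_to_column(mtcars, var = "car")

    # label every point
    label_adder2(mtcars_new, wt, mpg, car)

    # label only the heavy cars
    label_adder2(mtcars_new, wt, mpg, car, exp = wt > 5)
    ```

    Because the default is an ordinary expression, the function body needs neither match.call() nor a placeholder condition.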