Tags: r, ggplot2, dplyr

Passing a dataframe variable to a ggplot2 function


Problem:

  • Cannot access a df variable piped into ggplot2 from inside scale_y_continuous

Aim:

  • Add a secondary axis that shows proportions, with the total computed dynamically.

My current workarounds are to write the sum manually, which is not ideal, or to save the df in a variable and access it using df$count.

Inquiry:

  • How can I access the piped df dynamically, e.g. as .$count?

Reproducible example:

library(dplyr)
library(ggplot2)
library(scales)  # provides comma_format() and percent

mtcars %>%
  group_by(gear) %>%
  summarise(count = n()) %>%
  {
    ggplot(data = ., aes(x = gear, y = count)) +
      geom_col() +
      coord_flip() +
      scale_y_continuous(
        labels = comma_format(),
        sec.axis = sec_axis(
          ~ . / 32, # hard-coded total; the goal is sum(.$count)
          labels = scales::percent,
          name = "proportion"
        )
      )
  }


Reference: How do I access the data frame that has been passed to ggplot()?


Solution

  • The confusion arises because the transform argument to sec_axis() takes:

    A formula or function of a strictly monotonic transformation

    You are using formula notation with ~ . / 32. Inside that formula, the . refers to the axis values (the formula's own argument), which masks the . that normally refers to the object you have piped with %>%.
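
    The masking can be seen outside ggplot2 with a minimal sketch. It assumes that formula lambdas are converted with rlang::as_function(), which is my understanding of what ggplot2 does internally; the names f, g and res are made up for illustration.

    ```r
    library(magrittr)
    library(rlang)

    res <- c(8, 24) %>% {
      # Formula notation: inside the formula, `.` is the lambda's own argument.
      f <- as_function(~ . / 32)
      # Anonymous function: `x` is the argument, so `.` is still the piped vector.
      g <- function(x) x / sum(.)
      c(f(16), g(16))
    }
    res
    ```

    Both calls divide 16 by 32, but only g() gets its denominator from the piped object.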

    The easiest way to avoid this is to use an anonymous function instead of formula notation (the \(x) shorthand requires R 4.1 or later; function(x) is equivalent):

    mtcars %>%
        group_by(gear) %>%
        summarise(count = n()) %>%
        {
            ggplot(data = ., aes(x = gear, y = count)) +
                geom_col() +
                coord_flip() +
                scale_y_continuous(
                    labels = comma_format(),
                    sec.axis = sec_axis(
                        \(x) x / sum(.$count), # this is the only changed line
                        labels = scales::percent,
                        name = "proportion"
                    )
                )
        }
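
    As a quick sanity check, the dynamic denominator can be compared with the hard-coded 32 from the question. This is only a sketch; gear_counts and to_proportion are hypothetical names introduced here, not part of the original code.

    ```r
    library(dplyr)

    # Same summary table that is piped into ggplot() above.
    gear_counts <- mtcars %>%
      group_by(gear) %>%
      summarise(count = n())

    # The dynamic total should equal the number of rows in mtcars.
    total <- sum(gear_counts$count)

    # Same transformation as the anonymous function passed to sec_axis().
    to_proportion <- function(x) x / total
    props <- to_proportion(gear_counts$count)
    ```

    Since the proportions are the counts divided by their own sum, they add up to 1, which is what the percent-labelled secondary axis relies on.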
    
