
How to rename the bins in ggplot in R


I have created bins and computed the mean for each bin, so my data frame has these two columns. I am now plotting the two columns, but I want exact numbers as x-axis labels instead of the bin names. I am considering renaming each bin by its midpoint. Please look at the pictures: the first is my current plot and the second is the plot I want to achieve.

My current plot: [image]
What I want to have: [image]
My data frame looks like this: [image]


Solution

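  • To reproduce the style of the plot image you included, strip the brackets from each bin label, split it into its lower and upper bounds, and plot the midpoint of each bin on a continuous x axis: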

    library(tidyverse)
    
    df %>%
      mutate(bin_group = gsub("\\(|\\]", "", bin_group)) %>%
      separate(bin_group, sep = ",", into = c("lower", "upper")) %>%
      mutate(across(lower:upper, as.numeric)) %>%
      mutate(`Birth weight (g)` = (upper + lower) / 2) %>%
      ggplot(aes(`Birth weight (g)`, mean_28_day_mortality)) +
      geom_vline(xintercept = 1500) +
      geom_point(shape = 18, size = 4) +
      scale_x_continuous(labels = scales::comma) +
      labs(title = "One-year mortality", y = NULL) +
      theme_bw(base_family = "serif", base_size = 20) +
      theme(panel.grid.major.x = element_blank(),
            panel.grid.minor = element_blank(),
            # linewidth replaces the deprecated size argument in ggplot2 >= 3.4.0
            panel.grid.major.y = element_line(color = "black", linewidth = 0.5),
            plot.title = element_text(hjust = 0.5))
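
The bracket-stripping and midpoint step in the pipeline above can also be written in base R. This is a minimal sketch rather than the answer's own code: `bin_midpoint` is a hypothetical helper name, and it assumes every label has the `(lower,upper]` form shown in the data.

```r
# Interval labels in the "(lower,upper]" form used in the answer's data
bin_labels <- c("(1.35e+03,1.38e+03]", "(1.38e+03,1.41e+03]", "(1.41e+03,1.44e+03]")

# Hypothetical helper (not from any package): midpoint of each interval label
bin_midpoint <- function(labels) {
  # Drop the opening "(" and closing "]"; as.character() also handles factors
  bare <- gsub("\\(|\\]", "", as.character(labels))
  # Split "lower,upper" into its two parts
  parts <- strsplit(bare, ",", fixed = TRUE)
  lower <- as.numeric(vapply(parts, `[`, character(1), 1))
  upper <- as.numeric(vapply(parts, `[`, character(1), 2))
  (lower + upper) / 2
}

midpoints <- bin_midpoint(bin_labels)
print(midpoints)
```

Because `as.numeric()` understands scientific notation, labels such as `1.35e+03` need no special handling.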
    



    Edit

    To set specific axis ranges, use the limits argument of scale_x_continuous and scale_y_continuous (breaks controls where the ticks fall):

    library(tidyverse)
    
    df %>%
      mutate(bin_group = gsub("\\(|\\]", "", bin_group)) %>%
      separate(bin_group, sep = ",", into = c("lower", "upper")) %>%
      mutate(across(lower:upper, as.numeric)) %>%
      mutate(`Birth weight (g)` = (upper + lower) / 2) %>%
      ggplot(aes(`Birth weight (g)`, mean_28_day_mortality)) +
      geom_vline(xintercept = 1500) +
      geom_point(shape = 18, size = 4) +
      scale_x_continuous(labels = scales::comma, limits = c(1350, 1650),
                         breaks = seq(1350, 1650, 50)) +
      scale_y_continuous(limits = c(0, 0.1), name = NULL) +
      labs(title = "One-year mortality") +
      theme_bw(base_family = "serif", base_size = 20) +
      theme(panel.grid.major.x = element_blank(),
            panel.grid.minor = element_blank(),
            # linewidth replaces the deprecated size argument in ggplot2 >= 3.4.0
            panel.grid.major.y = element_line(color = "black", linewidth = 0.5),
            plot.title = element_text(hjust = 0.5))
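
One caveat with `limits` may be worth keeping in mind: on a continuous scale, values outside the limits are replaced with `NA` before drawing, whereas `coord_cartesian()` only zooms the view. In the data used here all midpoints fall between 1350 and 1650, so nothing is lost. The sketch below uses made-up points (an assumption for illustration, not the question's data) to show the difference.

```r
library(ggplot2)

# Made-up points; the first and last lie outside the 1350-1650 range
pts <- data.frame(x = c(1340, 1400, 1500, 1600, 1660),
                  y = c(0.06, 0.05, 0.04, 0.03, 0.02))
p <- ggplot(pts, aes(x, y)) + geom_point()

# Scale limits: out-of-range x values become NA
p_limits <- p + scale_x_continuous(limits = c(1350, 1650))
# Coordinate limits: the view is zoomed, the data are left untouched
p_zoom <- p + coord_cartesian(xlim = c(1350, 1650))

# Count the points that still have a usable x value in each built plot
kept_limits <- sum(!is.na(layer_data(p_limits)$x))
kept_zoom <- sum(!is.na(layer_data(p_zoom)$x))
```

If points near the edge of the range should stay in the plot (or in any fitted statistics), `coord_cartesian()` is usually the safer choice.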
    


    Data used (obtained from image in question using OCR)

    df <- structure(list(bin_group = structure(1:10, 
            levels = c("(1.35e+03,1.38e+03]", 
            "(1.38e+03,1.41e+03]", "(1.41e+03,1.44e+03]", "(1.44e+03,1.47e+03]", 
            "(1.47e+03,1.5e+03]", "(1.5e+03,1.53e+03]", "(1.53e+03,1.56e+03]", 
            "(1.56e+03,1.59e+03]", "(1.59e+03,1.62e+03]", "(1.62e+03,1.65e+03]"
            ), class = "factor"), mean_28_day_mortality = c(0.0563498, 0.04886257, 
            0.04467626, 0.04256053, 0.04248667, 0.04009187, 0.03625538, 0.03455094, 
            0.03349542, 0.02892909)), class = c("tbl_df", "tbl", "data.frame"
            ), row.names = c(NA, -10L))
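
Labels in scientific notation such as `(1.35e+03,1.38e+03]` are what base R's `cut()` produces with its default `dig.lab = 3`. Assuming the bins were created with `cut()` (the question does not say so), the sketch below shows how `dig.lab = 4` would give plain labels instead. The `weights` values are made up for illustration.

```r
# Made-up birth weights in grams (illustrative only)
weights <- c(1360, 1395, 1500, 1640)
breaks <- seq(1350, 1650, by = 30)

# Default dig.lab = 3 formats the break points in scientific notation
bins_sci <- cut(weights, breaks)
# dig.lab = 4 allows four significant digits, so 1350 is printed in full
bins_plain <- cut(weights, breaks, dig.lab = 4)

print(levels(bins_sci)[1])
print(levels(bins_plain)[1])
```

Either label format can be parsed into bounds the same way, since `as.numeric()` accepts both `1.35e+03` and `1350`.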