X limits with continuous character values in R ggplot


I am creating a bar graph whose x-axis labels are consecutive fiscal years, such as "2009/10", "2010/11", etc. My dataset has a column containing the specific fiscal year at which the x-axis labels should begin (see example image below). From there, the labels should cover every consecutive fiscal year up to the present, ending at "2018/19". When I try to set the limits with scale_x_continuous, I get Error: Discrete value supplied to continuous scale. However, if I use scale_x_discrete, I get a graph with only two bars: my chosen "Start" date and the "End" of 2018/19.

Start<-Project_x$Start[c(1)]
End<-"2018/2019"

ggplot(Project_x, (aes(x=`FY`, y=Amount)), na.rm=TRUE)+
geom_bar(stat="identity", position="stack")+
scale_x_continuous(limits = c(Start,End))

Error: Discrete value supplied to continuous scale

Thank you.

My data is:

df <- data.frame(Project = c(5, 6, 5, 5, 9, 5), 
             FY = c("2010/11","2017/18","2012/13","2011/12","2003/04","2000/01"),
             Start=c("2010/11", "2011/12", "2010/11", "2010/11", "2001/02", "2010/11"),
             Amount = c(500,502,788,100,78,NA))

To use the code in the answer below, I need to base my Start_Year off of my Start column rather than the FY column, and the graph should just be for Project #5.

as.tibble(df) %>% 
mutate(Start_Year = as.numeric(sub("/\\d{2}","",Start)))
xlabel_start<-subset(df$Start_Year, Project == 5)
xlabel_end<-2018
filter(between(Start_Year,xlabel_start,xlabel_end)) %>%
  ggplot(aes(x = FY, y = Amount))+
  geom_col()

When running this, my xlabel_start is NULL.

enter image description here


Solution

  • In ggplot2, continuous scales are meant for numeric values. Here, your fiscal years are stored as character (or factor) values, so ggplot2 treats them as discrete: character values are sorted alphabetically and factors follow their level order.

    One possible way to get the plot you expect is to create a new variable holding the starting year of each fiscal year, and then keep only the rows whose year lies between 2010 and 2018.

    But first, we isolate the project and the starting year of interest in a new data frame:

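    library(dplyr)
    
    xlabel_end <- 2018  # last fiscal year to show (2018/19), as in the question
    
    # Start year of Project 5, taken from the Start column
    xlabel_start <- as_tibble(df) %>% 
      mutate(Start_Year = as.numeric(sub("/\\d{2}","",Start))) %>%
      distinct(Project, Start_Year) %>%
      filter(Project == 5)
    
    xlabel_start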
    
    # A tibble: 1 x 2
      Project Start_Year
        <dbl>      <dbl>
    1       5       2010
    

    Now, using almost the same pipeline, we can isolate values of interest by doing:

    library(tidyverse)
    
    # Keep Project 5 rows whose fiscal year starts within the chosen range
    as_tibble(df) %>% 
      mutate(Year = as.numeric(sub("/\\d{2}","",FY))) %>%
      filter(Project == 5 & between(Year,xlabel_start$Start_Year,xlabel_end))
    
    # A tibble: 3 x 5
      Project FY      Start   Amount  Year
        <dbl> <fct>   <fct>    <dbl> <dbl>
    1       5 2010/11 2010/11    500  2010
    2       5 2012/13 2010/11    788  2012
    3       5 2011/12 2010/11    100  2011
    

    And once you have done this, you can simply add the ggplot plotting part at the end of this pipe sequence:

    library(tidyverse)
    
    as_tibble(df) %>% 
      mutate(Year = as.numeric(sub("/\\d{2}","",FY))) %>%
      filter(Project == 5 & between(Year,xlabel_start$Start_Year,xlabel_end)) %>%
      ggplot(aes(x = FY, y = Amount))+
      geom_col()
    

    enter image description here

    Does this answer your question?
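
The pipeline above only draws bars for fiscal years that actually occur in the data. If the axis should also list every fiscal year up to "2018/19", including years with no rows, one option is to build the full label sequence and pass it to scale_x_discrete(limits = ...). The following is a minimal sketch and not part of the original answer: fiscal_labels() is a hypothetical helper, and it assumes that labels always have the form "YYYY/YY" and that 2018 is the last starting year, as stated in the question. The ggplot2 part only runs if that package is installed.

```r
# Hypothetical helper (not from the original answer): build "YYYY/YY" labels
# for every fiscal year whose starting year lies between `start` and `end`.
fiscal_labels <- function(start, end) {
  years <- seq(start, end)
  sprintf("%d/%02d", years, (years + 1) %% 100)
}

# Data from the question, kept as character columns
df <- data.frame(
  Project = c(5, 6, 5, 5, 9, 5),
  FY = c("2010/11", "2017/18", "2012/13", "2011/12", "2003/04", "2000/01"),
  Start = c("2010/11", "2011/12", "2010/11", "2010/11", "2001/02", "2010/11"),
  Amount = c(500, 502, 788, 100, 78, NA),
  stringsAsFactors = FALSE
)

# Starting year of Project 5, derived from the Start column
start_year <- as.numeric(sub("/\\d{2}", "", unique(df$Start[df$Project == 5])))

# Every fiscal-year label from the project's start up to 2018/19 (assumed end)
all_fy <- fiscal_labels(start_year, 2018)

# Keep only Project 5 rows whose fiscal year falls inside that range
plot_df <- df[df$Project == 5 & df$FY %in% all_fy, ]

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(plot_df, aes(x = FY, y = Amount)) +
    geom_col() +
    scale_x_discrete(limits = all_fy)  # list every fiscal year, even empty ones
  # print(p) to display the plot
}
```

Because the x values stay discrete, the limits vector controls both which labels appear and their order, so the "Discrete value supplied to continuous scale" error does not arise.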