Tags: r, ggplot2, time-series, x-axis

R: Remove an x-axis range without data (NA) from a graph


I am trying to remove a range of the x-axis from a ggplot. My x variable encodes the year and the week:

202045: year 2020, week 45

202053: last week of 2020 (a year has 52 or 53 weeks, never more)

 summary(df$year_week)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 202045  202047  202050  202054  202052  202101

Unfortunately, my data "jump" from the last week of 2020 to the first week of 2021, and the x-axis shows "ghost" weeks in between. Example:

library(ggplot2)

year_week=rep(c(202045,202046,202047,202048,202049,202050,202051,202052,202053,202101),times=1)
cases=rnorm(200, 44, 33)
df=data.frame(year_week, cases)  # the 10 week codes are recycled over the 200 cases

ggplot(df, aes(x=year_week, y=cases))+
  geom_line()+
  theme(axis.text.x = element_text(angle = 45,
      hjust = 0.85, size=9))+
  scale_x_continuous(limits=c(202045, 202101))

[Image: graph1]

I tried to remove the range by setting it to NA, but the result is the same:

df$year_week[df$year_week>202053 & df$year_week<202101]= NA
df$cases[df$year_week>202053 & df$year_week<202101]= NA

ggplot(na.omit(df), aes(x=year_week, y=cases))+
geom_line()+
theme(axis.text.x = element_text(angle = 45,  
    hjust = 0.85, size=9))+
scale_x_continuous(limits=c(202045, 202101))

library(dplyr)  # provides %>% and filter()

df %>%
filter(!is.na(cases)) %>%
ggplot(aes(x=year_week, y=cases))+
geom_line()+
theme(axis.text.x = element_text(angle = 45,  
    hjust = 0.85, size=9))+
scale_x_continuous(limits=c(202045, 202101))

My expected graph (weeks 60 or 80 do not exist in any year):

[Image: expected graph]


Solution

  • The issue is that your year_week variable is numeric. Because the weeks stop at 52 (or 53), a continuous axis leaves a gap between the last week of one year and the first week of the next: after 202052, for example, there are 48 = 202101 - 202052 - 1 unused values before 202101. You can prevent that by converting year_week to a character with as.character. Alternatively, you can format it, e.g. split the year and the week and put a hyphen, a space, etc. in between, as I do in my code:

    Note: After converting to a character you have to set the group aesthetic (group = 1). Otherwise geom_line treats each week as its own group and does not connect the weeks.

    library(ggplot2)

    year_week=rep(c(202045,202046,202047,202048,202049,202050,202051,202052,202053,202101),times=1)
    cases=rnorm(200, 44, 33)
    df=data.frame(year_week, cases)
    
    # Turn 202045 into the label "2020-45"; the axis becomes discrete
    df$year_week <- paste(substr(df$year_week, 1, 4), substr(df$year_week, 5, 6), sep = "-")
    
    # group = 1 connects the points across the discrete weeks
    ggplot(df, aes(x=year_week, y=cases, group = 1))+
      geom_line()+
      theme(axis.text.x = element_text(angle = 45,  
                                       hjust = 0.85, size=9))
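
As a variation on the code above (not part of the original answer), here is a minimal sketch that first shows the numeric jump at the year change and then builds the labels as a factor whose levels follow the sorted week codes, so the chronological order does not depend on how the strings happen to sort. It assumes the codes always have six digits (YYYYWW) and uses one value per week. The names codes, labels and df2 are made up for illustration.

```r
library(ggplot2)

# Week codes as in the question: YYYYWW stored as numbers
codes <- c(202045:202053, 202101)

# Consecutive weeks differ by 1; the year change gives the last
# difference of 48 (202101 - 202053), which a continuous axis draws as empty space
diff(codes)

# Build "YYYY-WW" labels from the six-digit codes
labels <- paste(substr(codes, 1, 4), substr(codes, 5, 6), sep = "-")

set.seed(1)  # assumption: fixed seed only to make the sketch reproducible
df2 <- data.frame(
  # Levels ordered by the numeric code keep the weeks in chronological order
  year_week = factor(labels, levels = labels[order(codes)]),
  cases = rnorm(length(codes), mean = 44, sd = 33)
)

# A discrete x scale only shows levels that exist, so no "ghost" weeks appear;
# group = 1 makes geom_line() connect the points across the levels
p <- ggplot(df2, aes(x = year_week, y = cases, group = 1)) +
  geom_line() +
  theme(axis.text.x = element_text(angle = 45, hjust = 0.85, size = 9))
print(p)
```

With several observations per week, as in the question's data, you would probably want to aggregate first (for example the mean per week) before drawing the line.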