r dplyr lubridate

Reorder based on month and day while ignoring year in R


lubridate doesn't appear to have a month-day column type, and I would like to make a histogram across several years that focuses on just the month and day while ignoring the year. But when it comes time to sort, the month-day values are treated as text, so 1-10 comes right after 1-1, and so on. I'm sure there's an easy fix for this, but I can't seem to find it.

Here's some sample data if necessary:

structure(list(date = structure(c(19512, 13135, 14242), class = "Date"), 
    sport = c("MLB", "NFL", "NHL")), row.names = c(NA, -3L), class = c("tbl_df", 
"tbl", "data.frame"))

Solution

  • You can arrange using lubridate's month() and day() functions:

    library(tidyverse)
    
    df = structure(list(date = structure(c(19512, 13135, 14242), class = "Date"), 
                        sport = c("MLB", "NFL", "NHL")), row.names = c(NA, -3L), 
                   class = c("tbl_df", "tbl", "data.frame"))
    
    df %>% arrange(month(date), day(date))
    #> # A tibble: 3 × 2
    #>   date       sport
    #>   <date>     <chr>
    #> 1 2023-06-04 MLB  
    #> 2 2005-12-18 NFL  
    #> 3 2008-12-29 NHL
    

    Created on 2024-07-17 with reprex v2.0.2
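
    A possible alternative, sketched here rather than taken from the answer itself: if a single sortable month-day key is wanted, a zero-padded string from format(date, "%m-%d") sorts correctly as text, because "01-02" comes before "01-10". The two January rows and the month_day column name are made up for illustration.

    ```r
    library(dplyr)

    # The first three rows are the question's data; the two January rows are
    # hypothetical, added only to show the "1-1 vs 1-10" ordering problem.
    df <- tibble(
      date  = as.Date(c("2023-06-04", "2005-12-18", "2008-12-29",
                        "2010-01-10", "2015-01-02")),
      sport = c("MLB", "NFL", "NHL", "NBA", "MLS")
    )

    # "%m-%d" zero-pads both month and day, so plain text sorting gives
    # calendar order regardless of the year.
    sorted <- df %>%
      mutate(month_day = format(date, "%m-%d")) %>%
      arrange(month_day)

    sorted$month_day
    ```

    Because the key is an ordinary character column, it should also fall into calendar order on a discrete ggplot2 axis, which sorts character values alphabetically by default.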

    UPDATE

    To arrange a plot by month and day without considering the year, you can make another date column in which every date has the same year, and use only the month and day to label the breaks.

    library(tidyverse)
    
    df = structure(list(date = structure(c(19512, 13135, 14242), class = "Date"), 
                        sport = c("MLB", "NFL", "NHL")), row.names = c(NA, -3L), 
                   class = c("tbl_df", "tbl", "data.frame"))
    
    df$dateMD = df$date
    year(df$dateMD) = 2001
    df
    #> # A tibble: 3 × 3
    #>   date       sport dateMD    
    #>   <date>     <chr> <date>    
    #> 1 2023-06-04 MLB   2001-06-04
    #> 2 2005-12-18 NFL   2001-12-18
    #> 3 2008-12-29 NHL   2001-12-29
    
    ggplot(df, aes(dateMD, 1:3)) +
      geom_col() +
      scale_x_date(name = "date", date_labels = "%b-%d", date_breaks = "1 month")
    

    Created on 2024-07-17 with reprex v2.0.2
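
    Since the question asks for a histogram, the same shared-year idea can be sketched with geom_histogram() in place of geom_col(). This is only one way to do it: the weekly bin width (binwidth = 7, which is measured in days on a Date axis) is an assumption, not something the question specifies.

    ```r
    library(ggplot2)
    library(lubridate)

    df <- data.frame(
      date  = as.Date(c("2023-06-04", "2005-12-18", "2008-12-29")),
      sport = c("MLB", "NFL", "NHL")
    )

    # Copy the dates and force them all into one year, as in the answer above.
    df$dateMD <- df$date
    year(df$dateMD) <- 2001

    # Count observations per week of the year; labels show month and day only.
    p <- ggplot(df, aes(dateMD)) +
      geom_histogram(binwidth = 7) +
      scale_x_date(name = "date", date_labels = "%b-%d", date_breaks = "1 month")

    p
    ```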

    UPDATE 2

    If you want to change the year within a dplyr chain, check this question.
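
    For reference, a minimal sketch of doing this inside a pipeline, which may differ from what the linked question suggests: lubridate's make_date() can rebuild each date from its month and day with a fixed year. Using the leap year 2000 is an assumption made here so that 29 February stays a valid date; the answer above uses 2001.

    ```r
    library(dplyr)
    library(lubridate)

    df <- tibble(
      date  = as.Date(c("2023-06-04", "2005-12-18", "2008-12-29")),
      sport = c("MLB", "NFL", "NHL")
    )

    # Rebuild every date in one fixed year, then sort on the new column.
    df_md <- df %>%
      mutate(dateMD = make_date(year = 2000, month = month(date), day = day(date))) %>%
      arrange(dateMD)

    df_md
    ```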