Tags: r, date, dplyr, max, lubridate

Find Min and Max date from multiple date columns using R


I would like to create two new columns, maxdate and mindate, from a set of date columns. Assume there are 4 date columns and that they may contain missing values.

The attempt below only gives the max/min for each row of the columns. I am interested in finding the max/min date across the columns.

df$maxdate <- apply(df[1:4], 1, max, na.rm = TRUE)
df <- data.frame(
  col1 = c("11/09/1999", "11/09/1999", "11/09/1999", "11/09/1999", "11/09/1999"),
  col2 = c("01/01/2000", "01/01/2000", "01/01/2000", "01/01/2000", "01/01/2000"),
  col3 = c("12/09/1961", "10/03/1995", "30/03/1992", "25/05/1992", "25/05/1992"),
  col4 = c("01/01/1930", "01/01/1939", "01/01/1942", "01/01/1936", "01/01/1937")
)

Sample data:

col1        col2        col3        col4
11/09/1999  01/01/2000  12/09/1961  01/01/1930
11/09/1999  01/01/2000  10/03/1995  01/01/1939
11/09/1999  01/01/2000  30/03/1992  01/01/1942
11/09/1999  01/01/2000  25/05/1992  01/01/1936
11/09/1999  01/01/2000  25/05/1992  01/01/1937

Solution

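  • library(dplyr)
    
    # Example data: ISO-formatted (yyyy-mm-dd) date strings in four columns
    df <- data.frame(date1 = c("2023-05-11", "2023-04-12", "2023-07-13", "2023-01-14", "2023-05-15"),
                     date2 = c("2023-04-11", "2023-07-12", "2023-09-13", "2023-05-14", "2023-12-15"),
                     date3 = c("2023-08-11", "2023-06-12", "2023-08-13", "2023-08-14", "2023-05-15"),
                     date4 = c("2023-01-11", "2023-05-12", "2023-05-13", "2023-12-14", "2023-05-15"))
    
    # Convert every column from character to Date (across() supersedes mutate_all())
    df <- df %>% mutate(across(everything(), as.Date))
    
    # edit: removed rowwise and added na.rm = TRUE, as you seem to want the max from all rows, disregarding NAs?
    # Without rowwise(), max() and min() each return one value taken over all four columns and all rows
    df <- df %>% mutate(max_date = max(date1, date2, date3, date4, na.rm = TRUE),
                        min_date = min(date1, date2, date3, date4, na.rm = TRUE))
    
    df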
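
The answer above returns a single value taken over all rows. If one max and one min per row is wanted instead, a minimal base R sketch using pmax() and pmin() might look like the following. It uses the first three rows of the question's dd/mm/yyyy data, with one NA swapped in (an assumption, purely to show how missing values are handled), and date_cols is a name introduced here for illustration.

```r
# First three rows of the question's data (dd/mm/yyyy strings).
# The NA in col1 is NOT in the original data; it is added here only
# to illustrate missing-value handling.
df <- data.frame(
  col1 = c("11/09/1999", "11/09/1999", NA),
  col2 = c("01/01/2000", "01/01/2000", "01/01/2000"),
  col3 = c("12/09/1961", "10/03/1995", "30/03/1992"),
  col4 = c("01/01/1930", "01/01/1939", "01/01/1942")
)

# Hypothetical helper name: the columns that hold dates.
date_cols <- c("col1", "col2", "col3", "col4")

# Parse the strings as Date; otherwise max() compares them as text.
df[date_cols] <- lapply(df[date_cols], as.Date, format = "%d/%m/%Y")

# Row-wise max/min across the four columns, ignoring NAs.
df$maxdate <- do.call(pmax, c(df[date_cols], na.rm = TRUE))
df$mindate <- do.call(pmin, c(df[date_cols], na.rm = TRUE))

# Single overall max/min, derived from the row-wise results.
overall_max <- max(df$maxdate, na.rm = TRUE)
overall_min <- min(df$mindate, na.rm = TRUE)

df
```

pmax() and pmin() are vectorised over rows, so no apply() or rowwise() is needed, and the results stay of class Date. Passing format = "%d/%m/%Y" matters for the question's data, since as.Date() would not otherwise read day-first strings correctly.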