I'm pretty new to R and I can't figure my problem. I have been through countless forms.
My dataset looks like this: Glimpse of dataset
I want to find the first and last date of each Event and put it in a nice table. There are 26 events. However, the dates are in factor format which makes me unable to find the start and end date. And when I try to convert it to a numeric format I get NA for each value, and when I try to convert it to a date format it stays in factor format.
Can someone help me out?
As suggested I tried to find a way to share my dataset using dput. I tried it and I think this should work to get an 2x8 sample of my dataset.
df <- structure(list(`Release Date` = structure(c(1L, 3L, 4L, 5L, 6L, 7L, 8L, 9L), .Names = c("",
"", "", "", "",
"", "", ""), .Label = c("3/17/2020", "Release Date", "6/16/2020", "9/15/2020",
"12/16/2020", "12/17/2015", "6/17/2013", "9/17/2012", "6/14/2012",
"3/15/2012", "6/20/2011", "3/16/2011", "12/16/2010", "9/14/2010",
"6/16/2010", "3/17/2010", "12/15/2009", "9/15/2009", "6/16/2009",
"3/13/2009", "12/12/2008", "9/15/2008", "6/13/2008", "3/14/2008",
"12/13/2007", "9/12/2007", "6/14/2007", "3/15/2007", "12/14/2006",
"9/14/2006", "6/16/2006", "3/17/2006", "12/15/2005", "10/18/2005",
"9/21/2005", "7/15/2005", "6/21/2005", "4/15/2005", "3/15/2005",
"1/18/2005", "12/15/2004", "10/27/2004", "9/15/2004", "7/28/2004"
), class = "factor"), Event = structure(c(2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L), .Names = c("", "", "", "", "", "", "", ""), .Label = c("Event", "Labour Costs YoY",
"Unemployment Change (000's)", "Unemployment Rate", "Jobseekers Net Change"
), class = "factor")), row.names = c("X.1", "X.11", "X.12", "X.13", "X.14", "X.15", "X.16", "X.17"), class = "data.frame")
After converting the dates to date object you can use min
and max
to get first and last dates for each event.
library(dplyr)
df %>%
mutate(`Release Date` = as.Date(`Release Date`, '%m/%d/%Y')) %>%
group_by(Event) %>%
summarise(first_date = min(`Release Date`),
last_date = max(`Release Date`))
# Event first_date last_date
# <fct> <date> <date>
#1 Labour Costs YoY 2020-03-17 2020-12-16
#2 Unemployment Change (000's) 2012-06-14 2015-12-17