I am trying to find the difference between two dates and then group that value into factor levels. I have done this before with other numeric values, but not with dates, and I can't figure out what I am doing incorrectly. The function is created without errors, but both ways I have tried to apply it fail.
I originally calculated it in days because I need a day value later on. Grouping it into weeks is to create some levels for visualization later.
# Create Lead_Time column: how far in advance the appointment was booked, in days
df7$Lead_Time <- difftime(df7$Appointment_Date_Time, df7$appt_create_date, units = "days")
# Convert the difftime to whole days; values are negative when the
# appointment was created after its start time
df7$Lead_Time <- as.integer(df7$Lead_Time)
# Group Lead_Time by weeks
group_Lead_Time <- function(Lead_Time){
  if (Lead_Time <= 28){
    return('0-4 Weeks')
  }else if(Lead_Time > 29 & Lead_Time <= 56){
    return('5-8 Weeks')
  }else if (Lead_Time > 57 & Lead_Time <= 84){
    return('8-12 Weeks')
  }else if (Lead_Time > 85 & Lead_Time <= 112){
    return('12-16 Weeks')
  }else if (Lead_Time > 113 & Lead_Time <=140){
    return('16-20 Weeks')
  }else if (Lead_Time > 141 & Lead_Time <=168){
    return('20-24 Weeks')
  }else if (Lead_Time > 168){
    return('24+ Weeks')
  }
}
# Attempt 1: pass the whole column to the function
df7$Lead_Time_Grouped <- as.factor(group_Lead_Time(df7$Lead_Time))
# Attempt 2: apply the function to one value at a time
df7$Lead_Time_Grouped <- sapply(df7$Lead_Time,group_Lead_Time)
If someone has a better way to handle the negative values I am open to it as well. These are the error messages I get:
> df7$Lead_Time_Grouped <- as.factor(group_Lead_Time(df7$Lead_Time))
Warning messages:
1: In if (Lead_Time <= 28) { :
the condition has length > 1 and only the first element will be used
2: In if (Lead_Time > 29 & Lead_Time <= 56) { :
the condition has length > 1 and only the first element will be used
3: In if (Lead_Time > 57 & Lead_Time <= 84) { :
the condition has length > 1 and only the first element will be used
4: In if (Lead_Time > 85 & Lead_Time <= 112) { :
the condition has length > 1 and only the first element will be used
> df7$Lead_Time_Grouped <- sapply(df7$Lead_Time,group_Lead_Time)
Error in if (Lead_Time <= 28) { : missing value where TRUE/FALSE needed
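For anyone who hits the same messages, here is a minimal sketch of what seems to be going on. `if` expects a single TRUE/FALSE value, so passing the whole column only tests the first element (a warning here, and an error from R 4.2.0 onward). `sapply` does pass one value at a time, but a missing value then makes the condition NA, which `if` rejects. The vector `lead_time` below is made-up sample data, not the real `df7` column.

```r
# Hypothetical sample values, not the real df7 data
lead_time <- c(10, 40, NA, 200)

# Comparisons are vectorized: one TRUE/FALSE/NA per element
lead_time <= 28

# `if` needs exactly one non-missing TRUE/FALSE, so an NA condition fails.
# tryCatch() captures the error message instead of stopping the script.
na_result <- tryCatch(
  if (NA <= 28) "0-4 Weeks",
  error = function(e) conditionMessage(e)
)
na_result

# ifelse() works element by element and passes NA through unchanged
grouped <- ifelse(lead_time <= 28, "0-4 Weeks", "more than 4 Weeks")
grouped
```

So the second error suggests that `Lead_Time` contains at least one NA, most likely from a missing date in one of the two source columns.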
UPDATE/EDIT: Thanks for pointing me in the direction of ifelse. I was able to resolve the problem with the code below.
# Group lead time by weeks. Each nested ifelse() only sees values that failed
# the earlier tests, so one upper bound per level is enough and no day
# (29, 57, 85, ...) falls between two levels.
group_Lead_Time <- function(appt_lead_time){
  ifelse(appt_lead_time <= 28, '0-4 Weeks',
  ifelse(appt_lead_time <= 56, '5-8 Weeks',
  ifelse(appt_lead_time <= 84, '8-12 Weeks',
  ifelse(appt_lead_time <= 112, '12-16 Weeks',
  ifelse(appt_lead_time <= 140, '16-20 Weeks',
  ifelse(appt_lead_time <= 168, '20-24 Weeks',
  '24+ Weeks'))))))
}
df7$appt_lead_time_weeks <- group_Lead_Time(df7$appt_lead_time)
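An alternative to nested ifelse calls that may be worth considering is base R's `cut()`, which bins a numeric vector and returns a factor with the levels already in order. The sketch below is only an illustration: `lead_days` is made-up sample data, and clamping negative lead times to 0 with `pmax()` is an assumption about how the negatives should be treated, not something settled above.

```r
# Hypothetical lead times in days, not the real df7 data
lead_days <- c(-3, 0, 28, 29, 56, 57, 169, NA)

# Assumption: treat negative lead times (booked after the start time) as 0.
# pmax() keeps NA values as NA.
lead_days <- pmax(lead_days, 0)

# Intervals are right-closed by default, so (28, 56] covers days 29 to 56
lead_weeks <- cut(
  lead_days,
  breaks = c(-Inf, 28, 56, 84, 112, 140, 168, Inf),
  labels = c("0-4 Weeks", "5-8 Weeks", "8-12 Weeks", "12-16 Weeks",
             "16-20 Weeks", "20-24 Weeks", "24+ Weeks")
)

# Count per level, including any missing values
table(lead_weeks, useNA = "ifany")
```

Because the result is already a factor with levels in bin order, it should plot in the intended sequence without a separate `as.factor()` call.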