I want to create a variable in a data frame that categorizes the observations based on a column's quartile and median values.
Below is what I tried.
Name<-c("name1","name2","name3","name4","name5","name6")
Age<-c(49,12,29,55,25,19)
df9<-data.frame(Name,Age)
df9$catoG[df9$Age<=quantile(df9$Age,0.25)]<-"Young"
df9$catoG[df9$Age>quantile(df9$Age,0.25) & df9$Age<=median(df9$Age)]<-"Adult"
df9$catoG[df9$Age>median(df9$Age)]<-"Elder"
The output I received is:
   Name Age catoG
1 name1  49 Elder
2 name2  12 Young
3 name3  29 Elder
4 name4  55 Elder
5 name5  25 Adult
6 name6  19 Young
Is there a more efficient way to achieve the same result in R?
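cut is your friend for any task that involves splitting a vector into ranges. By default its intervals are closed on the right, so the result matches the <= comparisons in the question: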
df9$new = cut(df9$Age,
              breaks = c(-Inf, quantile(df9$Age, c(0.25, 0.5)), Inf),
              labels = c('Young', 'Adult', 'Elder'))
#   Name Age catoG   new
#1 name1  49 Elder Elder
#2 name2  12 Young Young
#3 name3  29 Elder Elder
#4 name4  55 Elder Elder
#5 name5  25 Adult Adult
#6 name6  19 Young Young
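If you need the same split for several columns, the cut call can be wrapped in a small function. This is only a sketch: the name categorize_by_quantile and its arguments are made up for illustration and are not part of base R.

```r
# Hypothetical helper (not from base R): label a numeric vector by its own
# quantiles, mirroring the cut() call above.
categorize_by_quantile <- function(x,
                                   probs = c(0.25, 0.5),
                                   labels = c("Young", "Adult", "Elder")) {
  # One label per interval: n cut points produce n + 1 groups.
  stopifnot(length(labels) == length(probs) + 1)
  breaks <- c(-Inf, quantile(x, probs, na.rm = TRUE), Inf)
  # right = TRUE (the default) gives intervals of the form (a, b],
  # so a value equal to a break falls into the lower group.
  cut(x, breaks = breaks, labels = labels, right = TRUE)
}

Age <- c(49, 12, 29, 55, 25, 19)
groups <- categorize_by_quantile(Age)
table(groups)
```

Note that cut stops with a "'breaks' are not unique" error when two of the quantiles coincide, which can happen if the column has many tied values.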