I need a function that takes two arguments: a numeric vector of ages and a character vector of age intervals such as "1-2", "3-4", "5-Inf".
For every given age, the function should return the corresponding age group.
I quickly came up with this function with two nested loops, and it works:
classifyAge <- function(ages, intervals) {
  result <- character(length(ages))
  for (i in seq_along(ages)) {
    for (j in seq_along(intervals)) {
      # Split "lo-hi" into its numeric bounds
      range <- as.numeric(strsplit(intervals[j], "-")[[1]])
      if (ages[i] >= range[1] && ages[i] <= range[2]) {
        result[i] <- intervals[j]
        break
      }
    }
  }
  return(result)
}
result <- classifyAge(c(1, 2, 3, 5, 5, 7,0), c("1-2", "3-4", "5-Inf"))
print(result)
[1] "1-2" "1-2" "3-4" "5-Inf" "5-Inf" "5-Inf" ""
I was just wondering whether the same functionality could be achieved using vectorized functions somehow?
I am aware of the cut function, but I did not have success with it.
cut is recommended and preferred.
vec <- c(1, 2, 3, 5, 5, 7,0)
bins <- c(0, 2, 4, Inf)
cut(vec, bins, labels = paste(bins[-length(bins)]+1, bins[-1], sep="-"))
# [1] 1-2 1-2 3-4 5-Inf 5-Inf 5-Inf <NA>
# Levels: 1-2 3-4 5-Inf
If you need to determine bins based on user-provided strings of integer-contiguous ranges, then perhaps:
txt <- c("1-2", "3-4", "5-Inf")
bins <- c(0, as.numeric(sub(".*-", "", txt)))
bins
# [1] 0 2 4 Inf
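Putting the two pieces together, a drop-in replacement for the loop version might look like the sketch below. The name classifyAgeCut is hypothetical, and the code assumes the intervals are sorted, non-negative, contiguous integer ranges written as "lo-hi". Unmatched ages are mapped to "" only to mimic the original function's output.

```r
# Hypothetical helper: a vectorized stand-in for the nested loops.
# Assumes sorted, non-negative, contiguous integer ranges like "1-2".
classifyAgeCut <- function(ages, intervals) {
  lower <- as.numeric(sub("-.*", "", intervals))  # "1-2" -> 1
  upper <- as.numeric(sub(".*-", "", intervals))  # "1-2" -> 2
  # cut() uses right-closed bins by default: (0,2], (2,4], (4,Inf]
  breaks <- c(lower[1] - 1, upper)
  out <- as.character(cut(ages, breaks, labels = intervals))
  out[is.na(out)] <- ""  # ages outside every bin, as in the loop version
  out
}

classifyAgeCut(c(1, 2, 3, 5, 5, 7, 0), c("1-2", "3-4", "5-Inf"))
# [1] "1-2"   "1-2"   "3-4"   "5-Inf" "5-Inf" "5-Inf" ""
```

One difference to keep in mind: a non-integer age such as 2.5 lands in "3-4" here, whereas the loop version would leave it unmatched, so this is only equivalent for whole-number ages.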