Is there a smarter way than the following to apply a simple arithmetic operation to a data frame column, depending on factor level?
data <- runif(100,0,1)
df <- data.frame(x = data,
class = cut(data, breaks = c(0,0.5,1), labels = c("low", "high")))
df$x2 <- ifelse(df$class == "high", df$x - 1, df$x + 1)
I have a data frame with several factor levels and would like to add to, or multiply, the values by a vector of different values. I thought maybe something with split could work?
Let's make use of the internal integer representation of a factor:
df$x2 <- with(df, c(1, -1)[class] + x)
I don't recommend using df and class as variable names, however, as they are also the names of functions that ship with R. (Don't use data for the same reason.)
Some explanation here. You have coded class with factor levels "low" and "high", in that order, so they map to the integer codes 1 and 2. Try as.integer(df$class) to see this. Now, your code suggests you want to add 1 to x for "low" and subtract 1 from x for "high", so we index the increment vector c(1, -1) by class, which picks the increment matching each row's level, then add the result to x.
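Since the question mentions several factor levels and both adding and multiplying, here is a minimal sketch of how the same indexing idea might extend to three levels. The names dat, vals, grp, offset_by_pos and scale_by_name, the break points, and the offset and scale values are all made up for illustration; they are not from the original post. The second lookup uses a named vector, which is one way to avoid depending on the order of the levels.

```r
set.seed(1)  # only so the sketch is reproducible

# Hypothetical data with three levels instead of two
vals <- runif(100, 0, 1)
dat <- data.frame(
  x   = vals,
  grp = cut(vals, breaks = c(0, 0.25, 0.75, 1),
            labels = c("low", "mid", "high"))
)

# A factor is stored as integer codes that follow the order of its levels
levels(dat$grp)
head(as.integer(dat$grp))

# Positional lookup: one offset per level, listed in level order
# (low, mid, high). Indexing by a factor uses its integer codes.
offset_by_pos <- c(1, 0, -1)
dat$x_add <- dat$x + offset_by_pos[dat$grp]

# Named lookup: index by the level labels, so the order in which the
# values are written does not matter. unname() drops the carried-over names.
scale_by_name <- c(high = 0.5, low = 2, mid = 1)
dat$x_mul <- dat$x * unname(scale_by_name[as.character(dat$grp)])
```

The positional form is shorter, but it silently gives wrong results if the levels are ever reordered; the named form costs an as.character() call in exchange for being explicit about which value belongs to which level.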