I have two identically coded variables because the question allowed multiple responses.
Say the variables are about hobbies: 1 = football, 2 = ice hockey, 3 = I have no hobbies.
So a respondent can have two hobbies: football PLUS ice hockey.
hobby1 <- c(1, 2, 3)
hobby1 <- factor(hobby1, labels = c("football", "ice hockey", "I have no hobbies"))
hobby2 <- c(1, 2, 3)
hobby2 <- factor(hobby2, labels = c("football", "ice hockey", "I have no hobbies"))
Now I am trying to extract the number of hobbies, ranging from 0 to 2.
I already tried:
sum(hobby1<2, hobby2<2)
How can this be done, given that sum() does not work on factors? Also, my attempt would not take the third category (no hobbies) into account.
Should I change my data arrangement, e.g. to dummy coding (football yes/no, ...)?
Dummy coding could be an easier approach, since once you convert the data to a factor you can't easily use sum or the < operator: sum() raises an error on a factor, and < on an unordered factor returns NA with a warning. This approach works in base R:
df <- data.frame(football   = c(0, 1, 1, 0),
                 ice_hockey = c(1, 1, 0, 0))
df$num_hobbies <- rowSums(df[, 1:2])
df
#   football ice_hockey num_hobbies
# 1        0          1           1
# 2        1          1           2
# 3        1          0           1
# 4        0          0           0
Or using dplyr to take advantage of the column names a little more easily:
library(dplyr)
df <- data.frame(football   = c(0, 1, 1, 0),
                 ice_hockey = c(1, 1, 0, 0)) %>%
  mutate(num_hobbies = football + ice_hockey)
df
#   football ice_hockey num_hobbies
# 1        0          1           1
# 2        1          1           2
# 3        1          0           1
# 4        0          0           0
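If the data already exists as two factor columns like in the question, the dummy columns can be derived from them rather than typed in by hand. The following is only a rough sketch: the four respondents are made-up example data, and it assumes a missing second response is stored as NA. Using %in% instead of == keeps an NA from propagating into the dummies.

```r
# Made-up example data: four respondents, up to two responses each.
labs <- c("football", "ice hockey", "I have no hobbies")
hobby1 <- factor(c(1, 2, 3, 1),  levels = 1:3, labels = labs)
hobby2 <- factor(c(2, 2, 3, NA), levels = 1:3, labels = labs)

# Dummy coding: 1 if either response names the sport, 0 otherwise.
# %in% returns FALSE for NA, so a missing response counts as "not mentioned".
football   <- as.integer(hobby1 %in% "football"   | hobby2 %in% "football")
ice_hockey <- as.integer(hobby1 %in% "ice hockey" | hobby2 %in% "ice hockey")

# "I have no hobbies" gets no dummy column, so it contributes 0 to the count.
num_hobbies <- football + ice_hockey
num_hobbies
```

Because each sport gets exactly one dummy column, a respondent who gives the same sport in both responses is counted once, and the "no hobbies" category ends up as 0 without special handling.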