I am working with data similar to the data below:
ID <- c("A", "B", "C", "D", "E")
x1 <- c(1,1,1,1,0)
x2 <- c(0,0,1,2,2)
x3 <- c(0,0,0,0,0)
x4 <- c(0,0,0,0,0)
df <- data.frame(ID, x1, x2, x3, x4)
It looks like:
> df
ID x1 x2 x3 x4
1 A 1 0 0 0
2 B 1 0 0 0
3 C 1 1 0 0
4 D 1 2 0 0
5 E 0 2 0 0
I want to create a new column, which is the product of the conditional statement: if x1 == 1 and all the other columns are equal to 0, then it is coded "Positive".
How can I reference all the other columns besides x1 without having to write out the rest of the columns in the conditional statement?
Base R:
df$new <- ifelse(df$x1 == 1 &                   ## check x1 condition
                 rowSums(df[, 3:5] != 0) == 0,  ## add the logical outcomes by row
                 "Positive",
                 "not_Positive")
The second line is a little tricky:
df[,3:5] (or df[,-(1:2)]) selects all the columns except the first two. You could also use subset(df,select=x2:x4) here (although ?subset says "Warning: This is a convenience function intended for use interactively ...").
!=0 tests whether the values are 0 or not, returning TRUE or FALSE.
rowSums() adds up the values (FALSE → 0, TRUE → 1).
If there might be NA values then you'll need an na.rm=TRUE in your rowSums() specification.
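To make the steps above concrete, here is a minimal sketch that builds the condition piece by piece and then shows the effect of na.rm=TRUE. The names others, nonzero, n_nonzero, df_na and others_na are made up for illustration, and selecting the columns by name with setdiff() is a swapped-in alternative to the positional df[,3:5] above, not something the question requires. Note that with na.rm=TRUE an NA is simply ignored, so a row with x1 == 1 whose other values are 0 or NA would still be coded "Positive"; whether that is what you want depends on your data.

```r
# Rebuild the example data from the question
df <- data.frame(
  ID = c("A", "B", "C", "D", "E"),
  x1 = c(1, 1, 1, 1, 0),
  x2 = c(0, 0, 1, 2, 2),
  x3 = c(0, 0, 0, 0, 0),
  x4 = c(0, 0, 0, 0, 0)
)

# Select "everything except ID and x1" by name rather than by position
# (illustrative alternative to df[, 3:5])
others <- df[, setdiff(names(df), c("ID", "x1")), drop = FALSE]

nonzero   <- others != 0       # logical matrix: TRUE where a value is non-zero
n_nonzero <- rowSums(nonzero)  # number of non-zero values in each row

df$new <- ifelse(df$x1 == 1 & n_nonzero == 0, "Positive", "not_Positive")
df

# Hypothetical copy of the data with one missing value, to show na.rm
df_na <- df[, c("ID", "x1", "x2", "x3", "x4")]
df_na$x3[1] <- NA
others_na <- df_na[, setdiff(names(df_na), c("ID", "x1")), drop = FALSE]

rowSums(others_na != 0)                # the count for row 1 is NA
rowSums(others_na != 0, na.rm = TRUE)  # the NA is dropped before summing
```

Selecting by name keeps working if columns are reordered or more x columns are added later, at the cost of having to list the columns you want to exclude.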