I have a data frame with information like this:
df <- data.frame(Col1 = c("value1", "value1", "value2", "value2"), Col2 = c("value2", "value1", "value2", "value1"), stringsAsFactors = FALSE)
+--------+--------+
| Col1   | Col2   |
+--------+--------+
| value1 | value2 |
| value1 | value1 |
| value2 | value2 |
| value2 | value1 |
+--------+--------+
I want to create a third column that holds a color depending on whether the values in the first two columns are the same. Right now my script is this:
for (i in 1:nrow(df)) {
  if (df[i, 1] == df[i, 2]) {
    df$color[i] <- "black"
  } else {
    df$color[i] <- "grey"
  }
}
which gives me the following output:
+--------+--------+-------+
| Col1   | Col2   | color |
+--------+--------+-------+
| value1 | value2 | grey  |
| value1 | value1 | black |
| value2 | value2 | black |
| value2 | value1 | grey  |
+--------+--------+-------+
This is the correct output, but I would like to know if there is a more idiomatic "R" way of doing this. I have large data sets, and looping over the rows one at a time is slow.
I think you could do this in one line using ifelse(), which is vectorized and compares the two columns element by element:
df$color <- ifelse(df$Col1 == df$Col2, "black", "grey")
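For completeness, here is a small self-contained sketch of the vectorized approach on the example data from the question, naming the new column color as in the desired output. The helper variable same and the second column color2 are only illustrative names, not part of the original code. The second variant, which indexes into a two-element vector, is just an alternative I would expect to behave the same way here.

```r
# Same example data as in the question
df <- data.frame(
  Col1 = c("value1", "value1", "value2", "value2"),
  Col2 = c("value2", "value1", "value2", "value1"),
  stringsAsFactors = FALSE
)

# Vectorized comparison: one TRUE/FALSE per row, no loop needed
same <- df$Col1 == df$Col2

# ifelse() returns "black" where same is TRUE and "grey" where it is FALSE
df$color <- ifelse(same, "black", "grey")

# Alternative (illustrative): same + 1 is 1 for FALSE and 2 for TRUE,
# so it can be used to index into a two-element vector of colors
df$color2 <- c("grey", "black")[same + 1]

print(df)
```

Note that if either column contains NA, the comparison yields NA for that row, and both variants then give NA as the color, so you may want to handle missing values separately.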