This is a simplification of my dataframe. The Color column is of type character.
|ID|Color |
|--|------|
|1 |Brown |
|2 |Black |
|3 |Red |
|4 |Blue |
|5 |Black |
|6 |Green |
|7 |Brown |
|8 |Red |
|9 |Yellow|
|10|Violet|
I would like to replace all colors that are NOT Black, Brown or Red with Other. I have a piece of code that works:
library(tidyverse)

df_clean <- df %>%
  mutate(Color = case_when(
    # One rule per color that should become "Other"
    str_detect(Color, "Blue") ~ "Other",
    str_detect(Color, "Green") ~ "Other",
    str_detect(Color, "Yellow") ~ "Other",
    str_detect(Color, "Violet") ~ "Other",
    # Black, Brown and Red fall through unchanged
    TRUE ~ Color
  ))
But I would have to do this for every color (my full dataset has more than 50 color names across more than 160,000 entries). Is there a simpler way to do this, such as using negate() or ! somewhere in the code? Something like: if it's not Black, Brown or Red, change it to Other.
You can negate %in% to replace every color that is not in the list you want to keep:
df$Color[!df$Color %in% c('Black', 'Brown', 'Red')] <- 'Other'
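To make the one-liner above easier to try out, here is a minimal base R sketch. The data frame is rebuilt from the table in the question, and the helper vector `keep` is a name introduced here for illustration only.

```r
# Sample data rebuilt from the table in the question
df <- data.frame(
  ID = 1:10,
  Color = c("Brown", "Black", "Red", "Blue", "Black",
            "Green", "Brown", "Red", "Yellow", "Violet"),
  stringsAsFactors = FALSE
)

# Colors to leave untouched (hypothetical helper vector)
keep <- c("Black", "Brown", "Red")

# `!` negates the %in% test, so every row whose color is not in `keep`
# is overwritten with "Other"
df$Color[!df$Color %in% keep] <- "Other"

table(df$Color)
# Black: 2, Brown: 2, Other: 4, Red: 2
```

Two things to keep in mind: the match is exact and case-sensitive, so "black" would not count as "Black", and because `NA %in% keep` is FALSE, missing values are also turned into "Other".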
You can also use fct_other from forcats:
library(dplyr)
library(forcats)
df %>% mutate(Color = fct_other(Color, keep = c('Black', 'Brown', 'Red')))
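One difference from the %in% approach is that fct_other returns a factor rather than a character vector. The sketch below, using the sample data from the question and the illustrative names `df_fct` and `df_chr`, shows how the column could be converted back to character if that is what the rest of the pipeline expects.

```r
library(dplyr)
library(forcats)

# Sample data rebuilt from the table in the question
df <- data.frame(
  ID = 1:10,
  Color = c("Brown", "Black", "Red", "Blue", "Black",
            "Green", "Brown", "Red", "Yellow", "Violet"),
  stringsAsFactors = FALSE
)

# Lump every color outside `keep` into the level "Other"
df_fct <- df %>%
  mutate(Color = fct_other(Color, keep = c("Black", "Brown", "Red")))

class(df_fct$Color)   # "factor"
levels(df_fct$Color)  # the kept levels, with "Other" placed last

# Convert back to character if a plain string column is needed
df_chr <- df_fct %>%
  mutate(Color = as.character(Color))
```

Keeping the factor can be convenient for plotting or modelling, since the "Other" level sorts after the kept colors; the character version behaves like the original column.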