I have data that looks like this.
It can be built with the following code:
df<-structure(list(Gender = c("M", "F", "M", "F", "F"), Location = c("Cleveland, OH",
"New Olreans, LA", "Chicago, IL", "Strongsville, OH", "Boston, MA"
)), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"
))
I want to build a variable "Comment" according to the following rules:
if Gender=="F" and we find "OH" in Location, then Comment = "Female in OH"
if Gender=="F" and we can't find "OH" in Location, then Comment = "Female in Other"
if Gender=="M" and we find "OH" in Location, then Comment = "Male in OH"
if Gender=="M" and we can't find "OH" in Location, then Comment = "Male in Other"
My code is:
df<-df %>%
mutate(Comment = case_when(Gender=="F" & grep("OH", df$Location)~"Female in OH",
Gender=="F" & !grep("OH", df$Location)~ "Female in Other",
Gender=="M" & grep("OH", df$Location2)~ "Male in OH",
Gender=="M" & !grep("OH", df$Location)~ "Male in other)",
TRUE~NA))
It doesn't work. Could anyone give me some guidance on this?
Use grepl rather than grep: it returns a logical TRUE/FALSE value for each element instead of the indexes of the matching elements. For example (with the other typos fixed as well):
library(dplyr)

df %>%
  mutate(Comment = case_when(
    Gender == "F" & grepl("OH", Location)  ~ "Female in OH",
    Gender == "F" & !grepl("OH", Location) ~ "Female in Other",
    Gender == "M" & grepl("OH", Location)  ~ "Male in OH",
    Gender == "M" & !grepl("OH", Location) ~ "Male in Other"))
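To see why grep does not fit inside case_when, here is a minimal base R sketch using the Location values from the question. The variable names location, idx and flags are just for illustration.

```r
# Location values copied from the question's data
location <- c("Cleveland, OH", "New Olreans, LA", "Chicago, IL",
              "Strongsville, OH", "Boston, MA")

# grep() returns the positions of the matching elements (length 2 here),
# so it cannot be combined element-wise with Gender == "F" (length 5)
idx <- grep("OH", location)
idx
# [1] 1 4

# grepl() returns one TRUE/FALSE per element (length 5)
flags <- grepl("OH", location)
flags
# [1]  TRUE FALSE FALSE  TRUE FALSE

# Negating the index vector does not mean "no match": the non-zero
# indexes are coerced to TRUE first, then negated
!idx
# [1] FALSE FALSE
```

Because the grepl result has the same length as the Gender column, it can be combined with & and negated with ! row by row.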
I took out the NA part since you covered all the cases and NA is the default value when no other matches occur. But if you need it explicitly, then you should use the typed version of NA for characters.
df %>%
  mutate(Comment = case_when(
    Gender == "F" & grepl("OH", Location)  ~ "Female in OH",
    Gender == "F" & !grepl("OH", Location) ~ "Female in Other",
    Gender == "M" & grepl("OH", Location)  ~ "Male in OH",
    Gender == "M" & !grepl("OH", Location) ~ "Male in Other",
    TRUE ~ NA_character_))
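As a quick illustration of what "typed NA" means, this base R sketch shows that a bare NA is logical while NA_character_ is a character value, so it matches the type of the other strings returned by case_when. The names plain_na and char_na are only for illustration. As far as I know, older dplyr versions were strict about this and newer ones may be more lenient, so treat the strictness as version-dependent.

```r
# A bare NA is a logical value
plain_na <- typeof(NA)
plain_na
# [1] "logical"

# NA_character_ is the missing value of type character
char_na <- typeof(NA_character_)
char_na
# [1] "character"

# Combining it with strings keeps the vector a character vector
typeof(c("Female in OH", NA_character_))
# [1] "character"
```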