I have a recordset with three columns for street address. The individuals completing the form sometimes assume "Street Address 2" is for "City, State and Zip". I want to delete the entry in "Street Address 2" if it looks like that is what the individual did. I'm finding this surprisingly hard to do in R/Tidyverse given the simplicity of solutions in Excel. Here's an example:
df <- data.frame(address2=c("Tulsa, OK", "Apt. 1","Harbor Club Apartments"), city = c("Tulsa", "Tulsa", "Tulsa"))
In this sample DF, I expect that my code will set address2 in record 1 to NA. I've tried several iterations of ifelse statements to no avail, and it seems the most promising approach would be to use str_detect(), as follows:
df <- mutate(address2 = ifelse(str_detect(df$address2,df$city)),NA, address2)
Theoretically, this should set address2 to NA if "Tulsa" is found in an address2 record, or retain the address2 record if not. However, it gives me an error:
Error in UseMethod("mutate") : no applicable method for 'mutate' applied to an object of class "logical"
Any thoughts will be greatly appreciated for how to do this as well as why this will not work. Best - Steve
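Your approach is good, but there are some syntax issues you need to correct. mutate() needs the data frame as its first argument, and the closing parenthesis of ifelse() is misplaced, so NA and address2 are passed to mutate() instead of to ifelse(). Because NA is then the first unnamed argument, mutate() dispatches on a logical value, which is what the error message reports. Inside mutate() you can also refer to the columns directly, without df$, e.g.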
library(tidyverse)
df <- data.frame(address2=c("Tulsa, OK", "Apt. 1","Harbor Club Apartments"),
city = c("Tulsa", "Tulsa", "Tulsa"))
df
#> address2 city
#> 1 Tulsa, OK Tulsa
#> 2 Apt. 1 Tulsa
#> 3 Harbor Club Apartments Tulsa
df <- mutate(df, address2 = if_else(str_detect(address2, city), NA, address2))
df
#> address2 city
#> 1 <NA> Tulsa
#> 2 Apt. 1 Tulsa
#> 3 Harbor Club Apartments Tulsa
Created on 2023-10-11 with reprex v2.0.2
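One thing to be aware of: str_detect() treats its pattern as a regular expression by default, so a city name containing characters such as "." would not be matched literally. The variant below is only a sketch of one way to guard against that, wrapping the pattern in stringr's fixed() and using NA_character_ so the replacement type is explicit. The name `cleaned` is just a placeholder for this example, and whether a literal match is the right rule for your real data is an assumption.

```r
library(dplyr)
library(stringr)

df <- data.frame(
  address2 = c("Tulsa, OK", "Apt. 1", "Harbor Club Apartments"),
  city = c("Tulsa", "Tulsa", "Tulsa")
)

# fixed() matches each row's city as a literal string, not a regular expression.
# NA_character_ keeps the column type explicit as character.
cleaned <- df %>%
  mutate(address2 = if_else(str_detect(address2, fixed(city)), NA_character_, address2))

cleaned
```

Note that any address2 value containing the city name would be blanked by this rule, including a legitimate entry such as a building named after the city, so it may be worth reviewing the matched rows before overwriting them.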