Tags: r, data-cleaning, contains

Create a dummy variable for all rows that contain County in R


Given the following using R:

County_or_City <- c("Butte County", "Oroville", "Solano Cnty", "Redding", "Maripossa county")
data.frame(County_or_City)

    County_or_City
1     Butte County
2         Oroville
3      Solano Cnty
4          Redding
5 Maripossa county

I would like to create a new column with a dummy variable that flags the rows containing Cnty, County, or county. Sorry, I know this is very basic, but I'm learning. How do I do this?


Solution

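  • Using base R: grepl() tests each value against the pattern C(ou)?nty, in which the (ou) group is optional, so it matches both Cnty and County; ignore.case = TRUE also covers county.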

    transform(data.frame(County_or_City), 
     dummy = grepl('C(ou)?nty', County_or_City, ignore.case = TRUE))
    

    Output:

       County_or_City dummy
    1     Butte County  TRUE
    2         Oroville FALSE
    3      Solano Cnty  TRUE
    4          Redding FALSE
    5 Maripossa county  TRUE
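
The answer above yields a logical TRUE/FALSE column. If a numeric 0/1 dummy is wanted instead, a minimal sketch (assuming the data is first stored in a data frame, here given the hypothetical name df) is to wrap the same grepl() call in as.integer():

```r
# Rebuild the example data from the question; `df` is a name chosen for this sketch.
df <- data.frame(
  County_or_City = c("Butte County", "Oroville", "Solano Cnty",
                     "Redding", "Maripossa county")
)

# grepl() returns a logical vector; as.integer() maps TRUE/FALSE to 1/0.
df$dummy <- as.integer(
  grepl("c(ou)?nty", df$County_or_City, ignore.case = TRUE)
)

print(df)
```

Rows 1, 3, and 5 get dummy = 1 and the others get 0. The logical version is usually enough for subsetting; the integer version may be more convenient when the column feeds into a model matrix or is exported to another tool.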