r data-quality

R: Data Quality Check: Zip Code matching the City


Can someone help me implement an idea in R?

When R receives an input file containing, for example, a list of companies and their addresses, I want it to check whether the zip code matches the city for each company. I have a list of all cities and zip codes for a given country. How can I use that list in an if statement?

Has anyone programmed something similar before?

Thanks for your help! Sandra


Solution

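  • Here is a quick example of one way to do it; it uses US telephone area codes as a stand-in for zip codes. In practice it is probably better to use fuzzy matching on the city names, since exact string comparison fails on small spelling differences.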

    # City codes (all city codes can be found at https://www.allareacodes.com/)
    my_city_codes <- data.frame(code = c(201:206), 
                                cities = c("Jersey City, NJ", "District of Columbia", "Bridgeport, CT", "Manitoba", "Birmingham, AL", "Seattle, WA"),
                                stringsAsFactors = FALSE)
    
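    # Function for checking whether a city matches the one registered for its code
    adress_checker <- function(adress, citycodes) {
      # Look up the registered city in the table passed in as `citycodes`
      # (not in the global my_city_codes)
      real_city <- citycodes$cities[citycodes$code == adress$code]
    
      # The code is not in the registry, so there is nothing to compare against
      if (length(real_city) == 0) {
        return("Unknown code")
      }
    
      # Checking if cities are the same
      if (real_city[1] == adress$city) {
        return("Correct city")
      } else {
        return("Incorrect city")
      }
    }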
    
    # Addresses to check
    right_city <- data.frame(code = 205, city = c("Birmingham, AL"), stringsAsFactors = FALSE)
    wrong_city <- data.frame(code = 205, city = c("Las Vegas"), stringsAsFactors = FALSE)
    
    # Testing function
    adress_checker(right_city, my_city_codes)
    [1] "Correct city"
    adress_checker(wrong_city, my_city_codes)
    [1] "Incorrect city"
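
To check every company in an input file at once, and to act on the fuzzy-matching suggestion, one option is a vectorised lookup with `match()` combined with the edit distance from base R's `adist()`. This is only a sketch: the `companies` table, the function name `check_addresses`, the `max_dist` threshold of 2 and the status labels are assumptions made up for illustration, and a real input file would be loaded with something like `read.csv()` instead of being typed in.

```r
# Reference table of codes and cities (same shape as my_city_codes above)
city_codes <- data.frame(
  code = 201:206,
  cities = c("Jersey City, NJ", "District of Columbia", "Bridgeport, CT",
             "Manitoba", "Birmingham, AL", "Seattle, WA"),
  stringsAsFactors = FALSE
)

# Hypothetical input: one row per company (stands in for the input file)
companies <- data.frame(
  company = c("A Corp", "B Ltd", "C Inc", "D GmbH"),
  code = c(205, 205, 206, 999),
  city = c("Birmingham, AL", "Las Vegas", "Seatle, WA", "Nowhere"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: checks all rows at once.
# max_dist is an assumed tolerance for typos, to be tuned on real data.
check_addresses <- function(addresses, citycodes, max_dist = 2) {
  # Row of the reference table for each address code (NA if the code is unknown)
  idx <- match(addresses$code, citycodes$code)
  expected <- citycodes$cities[idx]

  # Edit distance between the entered and the expected city, ignoring case
  dist <- mapply(
    function(entered, registered) {
      if (is.na(registered)) {
        NA_real_
      } else {
        adist(entered, registered, ignore.case = TRUE)[1, 1]
      }
    },
    addresses$city, expected,
    USE.NAMES = FALSE
  )

  # Classify each row; unknown codes are handled before distances are inspected
  status <- ifelse(is.na(expected), "Unknown code",
            ifelse(dist == 0, "Correct city",
            ifelse(dist <= max_dist, "Probable typo", "Incorrect city")))

  data.frame(addresses, expected_city = expected, status = status,
             stringsAsFactors = FALSE)
}

result <- check_addresses(companies, city_codes)
print(result)
```

With this made-up input, the four rows come back as "Correct city", "Incorrect city", "Probable typo" (one letter missing in "Seatle, WA") and "Unknown code". A fixed edit-distance threshold is a blunt tool: for short city names, a limit relative to the name length may work better.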