Tags: r, regex, leaflet, stringr, geocode

Cleaning Geocode Data


I've got a data frame like this:

df = data.frame(longitude = c('-235.969', 
                       '-23.596.244', 
                       '-2.359.186'))

It's an example of one geocode column that I'm trying to convert to something like this:

new_df = data.frame(longitude = c('-23.5969', '-23.596244', '-23.59186'))

The main purpose is to use the geocode in a leaflet application.


Solution

  • If really necessary, I would do this in two steps:

    library(magrittr)
    gsub(".", "", df$longitude, fixed = TRUE) %>%
      sub("(\\d{2})", "\\1\\.", .)
    
    [1] "-23.5969"   "-23.596244" "-23.59186" 
    

    First drop every "." from the string, then replace the first two digits with those same two digits followed by a single ".".

    PS. without pipes you could do:

    sub("(\\d{2})", "\\1\\.", gsub(".", "", df$longitude, fixed = TRUE))
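
    Since the values are meant for leaflet, you will probably want them as numbers rather than strings. Here is a minimal sketch of the same two steps with the intermediate result kept, assuming the example data from the question. The names no_dots, fixed and longitude_num are only illustrative, and the replacement is written as "\\1." here, which is equivalent for this purpose:

    ```r
    # Example data from the question
    df <- data.frame(longitude = c("-235.969", "-23.596.244", "-2.359.186"),
                     stringsAsFactors = FALSE)

    # Step 1: drop every literal dot (fixed = TRUE turns off regex matching)
    no_dots <- gsub(".", "", df$longitude, fixed = TRUE)
    # "-235969" "-23596244" "-2359186"

    # Step 2: put one dot back after the first two digits
    fixed <- sub("(\\d{2})", "\\1.", no_dots)
    # "-23.5969" "-23.596244" "-23.59186"

    # Keep a numeric copy next to the original strings
    df$longitude_num <- as.numeric(fixed)
    ```

    Keeping the original column makes it easy to compare the raw and cleaned values before passing them on.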
    

    EDIT: Important caveat:

    As Matt points out, this only works if the degree part of your longitude ALWAYS has two digits (10-99).
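
    To make the caveat concrete, here is a rough sketch of a helper that takes the number of degree digits as an argument. fix_coord and degree_digits are hypothetical names, not part of any package, and the sketch assumes you know from some other source how many degree digits each value should have; the strings themselves do not carry that information:

    ```r
    # Hypothetical helper: the caller states how many digits the degree part has
    fix_coord <- function(x, degree_digits = 2) {
      no_dots <- gsub(".", "", x, fixed = TRUE)          # drop every literal dot
      pattern <- sprintf("(\\d{%d})", degree_digits)     # e.g. "(\\d{2})"
      as.numeric(sub(pattern, "\\1.", no_dots))          # re-insert one dot
    }

    fix_coord("-235.969")                        # -23.5969
    fix_coord("-1.235.969", degree_digits = 3)   # -123.5969

    # With the default of two digits, a value meant as -5.123456 is silently
    # turned into a different coordinate that still looks plausible:
    fix_coord("-5.123.456")                      # -51.23456
    ```

    The last call shows why a simple range check would not catch the problem: the wrong result is still a valid longitude.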