r, parsing, coordinates, coordinate-systems, coordinate-transformation

Converting geographic coordinates


I have a CSV of places with geographic coordinates in degrees-minutes-seconds (DMS) format, but with no separators, like this:

df <- data.frame(name = c("farm_1", "farm_2", "seabrook_1", "rocks_road"),
                 lat = c(425319.3, 425317, 425317.1, 425323.3), 
                 long = c(705045.5, 705101.1, 705145.4, 705219.8))


name          lat        long
farm_1        425319.3   705045.5
farm_2        425317     705101.1
seabrook_1    425317.1   705145.4
rocks_road    425323.3   705219.8

I have another CSV of places with geographic coordinates in degrees and decimal minutes (DMM) format, like this:

df_2 <- data.frame(name = c("exeter_road", "hampton_hill", "portsmouth_ave", "pebble_ln"), 
                   GPS_cordinates_DMM = c("N42 58.855 W70 56.473", "N42 58.666 W70 54.981", 
                                          "N42 56.579 W70 52.550", "N42 55.949 W70 53.631"))


name           GPS_cordinates_DMM
exeter_road    N42 58.855 W70 56.473
hampton_hill   N42 58.666 W70 54.981
portsmouth_ave N42 56.579 W70 52.550
pebble_ln      N42 55.949 W70 53.631

I would like to parse the coordinates in each data frame and convert them to decimal latitude and longitude. For example, the first data frame would look like this:

df_dec <- data.frame(name = c("farm_1", "farm_2", "seabrook_1", "rocks_road"), 
                 latitude = c(42.88869444,  42.88805556, 42.88808333, 42.88980556),
                 longitude = c(70.84597222, 70.85030556, 70.86261111, 70.87216667))
   name     latitude  longitude
  farm_1     42.88869  70.84597
  farm_2     42.88806  70.85031
  seabrook_1 42.88808  70.86261
  rocks_road 42.88981  70.87217

And the second data frame would look like this:

df_2_dec <- data.frame(name = c("exeter_road", "hampton_hill", "portsmouth_ave", "pebble_ln"), 
                       latitude = c(42.98091667, 42.97776667, 42.94298333, 42.93248333), 
                       longitude = c(70.94121667, 70.91635, 70.87583333, 70.89385))


name            latitude  longitude
exeter_road     42.98092  70.94122
hampton_hill    42.97777  70.91635
portsmouth_ave  42.94298  70.87583
pebble_ln       42.93248  70.89385

Then I can eventually combine and map/analyze them.

Is there a package or function that can parse and convert these coordinate types?

If not, how would you recommend writing one that is robust and can deal with issues such as no decimal in the latitude of the second row of the first dataset?


Solution

  • Using substr, you can extract the digits for degrees, minutes, and seconds from the strings by position (substring doesn't need an ending position), convert them to numeric, and calculate the decimal degrees.

    f1 <- function(x) (as.numeric(substr(x, 1, 2))*60^2 + as.numeric(substr(x, 3, 4))*60 + 
                         as.numeric(substring(x, 5)))/60^2
    
    res1 <- data.frame(name=df$name, lapply(df[-1], f1))
    res1
    #         name      lat     long
    # 1     farm_1 42.88869 70.84597
    # 2     farm_2 42.88806 70.85031
    # 3 seabrook_1 42.88808 70.86261
    # 4 rocks_road 42.88981 70.87217
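
    The positional approach above assumes exactly two degree digits. As a minimal sketch of an alternative, assuming the values are packed as DDMMSS.s (or DDDMMSS.s) numbers, the fields can also be separated arithmetically with %/% and %%. The function name dms_packed_to_dec is hypothetical and not part of any package; a missing decimal part (as in farm_2) needs no special handling here.

    ```r
    # Hypothetical helper: convert packed DMS values (DDMMSS.s or DDDMMSS.s)
    # to decimal degrees using integer division and modulo.
    dms_packed_to_dec <- function(x) {
      x    <- as.numeric(x)          # also accepts character input such as "425317"
      sgn  <- sign(x)                # keep the sign in case negative values occur
      x    <- abs(x)
      deg  <- x %/% 10000            # everything left of the MMSS.s part
      mins <- (x %% 10000) %/% 100   # the two minute digits
      secs <- x %% 100               # seconds, including any decimal fraction
      sgn * (deg + mins / 60 + secs / 3600)
    }

    # Usage on the first two latitudes from the question
    dms_packed_to_dec(c(425319.3, 425317))
    ```

    Because the degrees are whatever remains after removing the last four integer digits, the same function should also cope with three-digit longitudes.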
    

    The second format can be split at N, S, E, or W using strsplit and then handled in basically the same way as the first.

    tmp <- as.data.frame(
      gsub("\\D", "", do.call(rbind, strsplit(df_2$GPS_cordinates_DMM, "[NSEW]"))[,-1]))
    f2 <- function(x) as.numeric(substr(x, 1, 2)) + 
      as.numeric(substring(x, 3))/1e3/60
    res2 <- data.frame(name=df_2$name, setNames(lapply(tmp, f2), c("lat", "lon")))
    res2
    #             name      lat      lon
    # 1    exeter_road 42.98092 70.94122
    # 2   hampton_hill 42.97777 70.91635
    # 3 portsmouth_ave 42.94298 70.87583
    # 4      pebble_ln 42.93248 70.89385
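
    The split-and-strip approach above relies on the minutes always having three decimals and ignores the hemisphere letters. A possible variant, sketched here under the assumption that every string looks like "N42 58.855 W70 56.473", captures the fields with a regular expression and signs the result by hemisphere. The function name parse_dmm is hypothetical. Note that this returns negative longitudes for W, unlike the unsigned values shown in the question.

    ```r
    # Hypothetical helper: parse "N42 58.855 W70 56.473" style strings
    # into signed decimal latitude and longitude.
    parse_dmm <- function(x) {
      x <- as.character(x)
      pattern <- "^ *([NS]) *([0-9]+) +([0-9.]+) +([EW]) *([0-9]+) +([0-9.]+) *$"
      m <- regmatches(x, regexec(pattern, x))
      # Strings that do not match the pattern become NA rows instead of errors
      m <- lapply(m, function(p) if (length(p) == 7) p[-1] else rep(NA_character_, 6))
      parts <- do.call(rbind, m)
      to_dec <- function(hem, deg, mins) {
        sgn <- ifelse(hem %in% c("S", "W"), -1, 1)   # south and west are negative
        sgn * (as.numeric(deg) + as.numeric(mins) / 60)
      }
      data.frame(lat = to_dec(parts[, 1], parts[, 2], parts[, 3]),
                 lon = to_dec(parts[, 4], parts[, 5], parts[, 6]))
    }

    # Usage on one value from the question plus a malformed string
    parse_dmm(c("N42 58.855 W70 56.473", "not a coordinate"))
    ```

    Wrapping the longitude in abs() would reproduce the unsigned values from the question, if that is what the downstream mapping code expects.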