Search code examples
rcoordinatesspatialr-spmap-projections

Converting latitude and longitude data to UTM with points from multiple UTM zones in R


This question is an extension of the one asked on this thread: Converting latitude and longitude points to UTM. I'm not sure if it should be a reply on there, but am asking it here for now.

I have a dataset containing biological records with lat/long co-ordinates from a number of UTM zones and various attributes (dates, recorders, quantity etc). I would like to convert these to UTM coordinates. I have found ways of converting lat/long to UTM when the points are all in the same UTM zone, but am struggling to find an effective solution when they are all in different zones. The only question asking something similar I can find is this one: Projecting long/lat with multiple UTM zones, where the OP is advised to do something completely different instead.

I have tried the answers in the (first) linked thread, where there is a function for finding the UTM Zone of long/lat co-ordinates, and a function for converting long/lat coordinates to UTM based on a specified zone.

find_UTM_zone <- function(longitude, latitude) {

 # Special zones for Svalbard and Norway
    if (latitude >= 72.0 && latitude < 84.0 ) 
        if (longitude >= 0.0  && longitude <  9.0) 
              return(31);
    if (longitude >= 9.0  && longitude < 21.0)
          return(33)
    if (longitude >= 21.0 && longitude < 33.0)
          return(35)
    if (longitude >= 33.0 && longitude < 42.0) 
          return(37)

    (floor((longitude + 180) / 6) %% 60) + 1
}

LongLatToUTM <-
  function(x, y, zone) {    
    xy <- data.frame(X = x, Y = y, zone = zone) #Makes a dataframe with a column x coordinates, y coordinates and UTM zone.
    coordinates(xy) <- c("X", "Y") #generates coordinates in that dataframe
    proj4string(xy) <- CRS("+proj=longlat +datum=WGS84")  #sets projection (it's currently standard lat long)
    res <- spTransform(xy, CRS(paste("+proj=utm +zone=",zone," ellps=WGS84",sep='')))  #changes the projection to UTM, based on the zone in the column.
    return(res)
  }

These look to me like they should work fine on columns of lat/long coordinates, but instead the first one just returns the UTM zone of the first coordinate in every row, and the second one converts but to the UTM zone of the first coordinate (so only this one and any subsequent coordinates that happen to be in the same zone are correct). Clearly I don't understand these tools well enough...

Does anyone have any suggestions for how to stop this happening? I feel like this is probably more of a question about syntax and dataframes than dealing with spatial data, and a simple adjustment to how these functions are written would solve things.

Some example data here with the lat (x) and long (y) coordinates, the 'correct' UTM equivalents (x_correct and y_correct), their correct UTM zone, and the country they are in.


test_coordinates <- data.frame(x = c(13.637398, -3.58627, -5.178889), y = c(41.30736, 40.72913, 40.17528), x_correct = c(385936, 450492, 314480), y_correct = c(4573773, 4508854, 4449488 ),  zone = c(33, 30, 30), key = c(1, 2, 3), country = c("italy", "spain", "spain"))

My solution was to run the UTM zone finder over the data frame using mapply, and write a new version of the converter that takes each point in turn and run this with mapply as well.

LongLatToUTM2 <-
  function(x, y, zone) {
    pt <- st_sfc(st_point(c(x, y), dim = XY)) #Creates a point out of x and y coordinates
    st_crs(pt) <- "+proj=longlat +datum=WGS84" #Sets the CRS as standard lat long for this point
    pt2<- st_transform(pt, CRS(paste("+proj=utm +zone=",zone," ellps=WGS84",sep=''))) #Converts this to UTM for the zone specified in the function
    return(pt2) #returns the new UTM point
  }

mapply(find_UTM_zone, test_coordinates$x, test_coordinates$y) -> zones2 #Creates a vector of the correct UTM zones

mapply(LongLatToUTM2, test_coordinates$x, test_coordinates$y, test_coordinates$zone) -> converted_points #Makes a list of the correct UTM points

data.frame(t(sapply(converted_points,c)))  #Makes this list into a dataframe

The problems with this are it is very slow (I have 60,000 records to convert...) and also you end up with a dataframe of UTM co-ordinates that you can't really match to the original co-ordinates (I'm not sure if you can actually add an attribute to a single point in R?). Ideally I'd like to be able to keep the attributes with the converted coordinates.

EDIT: Just to clarify what I'm after - I'd like to improve the functions so that they work on dataframes, rather than single coordinates, i.e. with an input of a dataframe of longitudes and latitudes, to which I can add a column containing their UTM zones, and then another column with the correct UTM coordinates, based on these zones.

I've realised it's fairly easy to get the UTM zones using a dplyr method (if I just ignore Svalbard/Norway coordinates for now):

test_coordinates %>% 
  mutate(zone2 = (floor((x + 180)/6) %% 60) + 1)

But part two, how to easily write a function that gets UTM coordinates using these zones is still a mystery to me.


Solution

  • UPDATE

    Here is a much faster and elegant workaround using dplyr and spTransfrom

    Augmented data (60k+ rows):

    test_coordinates <- data.frame(x = c(13.637398, -3.58627, -5.178889), y = c(41.30736, 40.72913, 40.17528), x_correct = c(385936, 450492, 314480), y_correct = c(4573773, 4508854, 4449488 ),  zone = c(33, 30, 30), key = c(1, 2, 3), country = c("italy", "spain", "spain"))
    test_coordinates = rbind(test_coordinates, test_coordinates[rep(1,60*1000),]) # simulate big data
    
    library(dplyr)
    library(sp)
    
    get_utm <- function(x, y, zone, loc){
      points = SpatialPoints(cbind(x, y), proj4string = CRS("+proj=longlat +datum=WGS84"))
      points_utm = spTransform(points, CRS(paste0("+proj=utm +zone=",zone[1]," +ellps=WGS84")))
      if (loc == "x") {
        return(coordinates(points_utm)[,1])
      } else if (loc == "y") {
        return(coordinates(points_utm)[,2])
      }
    }
    
    test_coordinates %<>% 
      mutate(zone2 = (floor((x + 180)/6) %% 60) + 1, keep = "all"
      ) %>% 
      group_by(zone2) %>% 
      mutate(utm_x = get_utm(x, y, zone2, loc = "x"),
             utm_y = get_utm(x, y, zone2, loc = "y"))
    

    Output (5 rows only)

    test_coordinates
    
    
    # A tibble: 603 × 10
    # Groups:   zone2 [2]
           x     y x_correct y_correct  zone   key country zone2   utm_x    utm_y
       <dbl> <dbl>     <dbl>     <dbl> <dbl> <dbl> <chr>   <dbl>   <dbl>    <dbl>
     1 13.6   41.3    385936   4573773    33     1 italy      33 385936. 4573773.
     2 -3.59  40.7    450492   4508854    30     2 spain      30 450492. 4508854.
     3 -5.18  40.2    314480   4449488    30     3 spain      30 314480. 4449488.
     4 13.6   41.3    385936   4573773    33     1 italy      33 385936. 4573773.
     5 13.6   41.3    385936   4573773    33     1 italy      33 385936. 4573773.
    

    ORIGINAL RESPONSE

    Replace:

    data.frame(t(sapply(converted_points,c)))  #Makes this list into a dataframe
    

    With:

    test_coordinates$utm_x <- unlist(converted_points)[c(T,F)]
    test_coordinates$utm_y <- unlist(converted_points)[c(F,T)]
    
              x        y x_correct y_correct zone key country    utm_x   utm_y
    1 13.637398 41.30736    385936   4573773   33   1   italy 385935.7 4573773
    2 -3.586270 40.72913    450492   4508854   30   2   spain 450492.4 4508854
    3 -5.178889 40.17528    314480   4449488   30   3   spain 314479.5 4449488