Tags: r, geospatial, spatial, r-sp

How do I create (in R) a vector of distances between UTM locations based on group?


I have a data frame of individual animals located over different lengths of time. Each row identifies the individual (e.g., T003, T121), the X and Y coordinates in UTM, and the date the location was collected. I'm trying to calculate the average daily distance moved for each individual, to create a vector for comparison between individuals and populations. What's the best way to do this in R?

    TelemetryID     Date Easting Northing
1          T007  9/25/11  739632  3597373
2          T007  8/13/11  739637  3597367
3          T007  8/22/11  739641  3597375
4          T007  9/23/11  739637  3597388
5          T007  8/17/11  739639  3597409
6          T007   9/5/11  739623  3597379
7          T007  8/20/11  739635  3597385
8          T007   9/8/11  739668  3597369
9          T007  8/15/11  739633  3597384
10         T007   9/3/11  739632  3597377

I recognize that these are not consecutive dates, so the code needs to account for the number of calendar days between locations.

The end goal is a vector of average daily distance moved, to add as a column to the following data frame:

    TelemetryID         Area    Date Sex 
1          T001 6.643804e-11 8/10/11   M 
2          T002 5.940842e-12  8/7/11   M 
3          T003 1.389048e-10 8/10/11   M  
4          T004 8.175402e-12  8/7/11   M 
5          T005 4.928881e-11  8/9/11   M 
6          T006 2.697745e-11 8/10/11   M 
7          T007 1.168960e-10 8/10/11   F   

The input and output tables differ because the input table includes every location for an individual, which will be distilled to a single average value per individual; that average will be a dependent variable in modeling.

I tried the following code:

result <- SlimBoth %>%
  mutate(Date = as.Date(Date, format = "%m/%d/%y")) %>%
  arrange(Date) %>%
  group_by(TelemetryID) %>%
  mutate( Dist = pointDistance(cbind(Easting, Northing),
                               cbind(lag(Easting), lag(Northing)),
                               lonlat = FALSE),
          Elapsed = as.integer(Date - lag(Date)),
          DistPerDay = Dist / Elapsed)
result

result %>% 
  dplyr::summarise(AveDist = mean(DistPerDay, na.rm = TRUE)) %>%
  right_join(Telemetered.1)->ADDM

This code runs, and I updated the Telemetered.1 data frame to include the column for Average Daily Distance Moved. However, the resulting table has "Inf" in many rows where the mean movement values should be.

   TelemetryID AveDist Date    Easting Northing Sex   Translocated
   <chr>         <dbl> <chr>     <int>    <int> <chr> <chr>       
 1 T001          Inf   8/10/11  736408  3598539 M     No          
 2 T002          Inf   8/7/11   736529  3598485 M     No          
 3 T003          Inf   8/10/11  736431  3598671 M     No          
 4 T004          Inf   8/7/11   736535  3598673 M     No          
 5 T005          Inf   8/9/11   739641  3597415 M     No          
 6 T006           30.2 8/10/11  735846  3598974 M     No          
 7 T007          Inf   8/10/11  739647  3597146 F     No          
 8 T008          Inf   8/11/11  739797  3597455 M     No          
 9 T009          Inf   8/11/11  729166  3603726 F     No          
10 T010          Inf   8/11/11  729058  3603703 M     No    

The first data frame includes every location for each individual. I want to summarize these locations per individual as a single value, Average Daily Distance Moved (ADDM), giving one value per individual. I then want to add this value to another data frame for modeling that includes the individual (TelemetryID), sex, translocation status, ADDM, and home range area (which I've calculated separately for each individual). Here's data for an individual that was located twice on at least one day:

     TelemetryID    Date     Time Easting Northing Sex Translocated
4969        T237 8/14/13 10:36:00  740968  3597704   M           No
4970        T237  8/7/13 10:52:00  740860  3597865   M           No
4971        T237 8/13/13 09:49:00  740893  3597835   M           No
4972        T237 7/29/13 19:41:00  740872  3597872   M           No
4973        T237  8/6/13 10:36:00  741002  3597627   M           No
4974        T237 8/17/13 19:13:00  740965  3597710   M           No
4975        T237 8/18/13 19:25:00  740964  3597705   M           No
4976        T237  8/3/13 10:58:00  740860  3597865   M           No
4977        T237  8/5/13 09:20:00  740985  3597695   M           No
4978        T237 8/14/13 19:37:00  741005  3597644   M           No
4979        T237 7/30/13 10:03:00  740862  3597862   M           No
4980        T237 7/31/13 10:37:00  740874  3597862   M           No
4981        T237 8/20/13 18:56:00  740916  3597720   M           No
4982        T237 8/21/13 05:46:00  741025  3597736   M           No
4983        T237 8/27/13 10:07:00  740963  3597828   M           No
4984        T237 8/30/13 09:54:00  741019  3597768   M           No
4985        T237  9/1/13 11:07:00  740871  3597861   M           No
4986        T237 8/28/13 09:51:00  740954  3597626   M           No
4987        T237  8/1/13 19:07:00  740880  3597862   M           No
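
A minimal base R sketch, using the two 8/14/13 fixes from the table above, of how same-day locations could produce the Inf values. With date-only arithmetic the elapsed time is zero days, so distance divided by elapsed time is infinite, and a single Inf makes the mean Inf. The variable names are illustrative, and tz = "UTC" is an assumption made only to keep the example deterministic.

```r
# Two fixes for T237 on the same calendar day (rows 4969 and 4978 above)
d <- data.frame(
  Date     = c("8/14/13", "8/14/13"),
  Time     = c("10:36:00", "19:37:00"),
  Easting  = c(740968, 741005),
  Northing = c(3597704, 3597644)
)

# Planar (Euclidean) distance between the two fixes, in UTM metres
dist <- sqrt(diff(d$Easting)^2 + diff(d$Northing)^2)

# Date-only arithmetic: zero days elapsed, so the rate is Inf
days_by_date <- as.integer(diff(as.Date(d$Date, format = "%m/%d/%y")))
rate_by_date <- dist / days_by_date

# Date-time arithmetic: a fraction of a day elapsed, so the rate is finite
dt <- as.POSIXct(paste(d$Date, d$Time), format = "%m/%d/%y %H:%M:%S", tz = "UTC")
days_by_time <- as.numeric(diff(dt), units = "days")
rate_by_time <- dist / days_by_time

# One Inf is enough to make a mean Inf (25.9 is an arbitrary finite rate)
mean_with_inf <- mean(c(rate_by_date, 25.9), na.rm = TRUE)

print(c(rate_by_date = rate_by_date, rate_by_time = rate_by_time,
        mean_with_inf = mean_with_inf))
```

Note that na.rm = TRUE does not help here, because Inf is not a missing value.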

Solution

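  • One approach would be to use pointDistance from raster and lag from dplyr, working with full date-times rather than dates so that two locations on the same day do not give an elapsed time of zero: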

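    library(dplyr)
    library(raster)  # provides pointDistance(); also masks dplyr::select()
    result <- data %>%
      # Combine date and time so same-day locations have a non-zero time gap
      mutate(DateTime = as.POSIXct(paste(Date,Time), format = "%m/%d/%y %H:%M:%S")) %>%
      dplyr::select(TelemetryID, Sex, Translocated, Easting, Northing, DateTime) %>%
      arrange(DateTime) %>%
      group_by(TelemetryID) %>%
      # Distance from the previous location (lonlat = FALSE gives planar distance)
      mutate( Dist = pointDistance(cbind(Easting, Northing),
                                   cbind(lag(Easting), lag(Northing)),
                                   lonlat = FALSE),
              # Days elapsed since the previous location
              Elapsed = as.numeric(difftime(DateTime,lag(DateTime),units = "days")),
              DistPerDay = Dist / Elapsed) 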
    result
    #   TelemetryID Sex   Translocated Easting Northing DateTime              Dist Elapsed DistPerDay
    #   <fct>       <fct> <fct>          <int>    <int> <dttm>               <dbl>   <dbl>      <dbl>
    # 1 T237        M     No            740872  3597872 2013-07-29 19:41:00  NA     NA          NA   
    # 2 T237        M     No            740862  3597862 2013-07-30 10:03:00  14.1    0.599      23.6 
    # 3 T237        M     No            740874  3597862 2013-07-31 10:37:00  12      1.02       11.7 
    # 4 T237        M     No            740880  3597862 2013-08-01 19:07:00   6      1.35        4.43
    # 5 T237        M     No            740860  3597865 2013-08-03 10:58:00  20.2    1.66       12.2 
    # 6 T237        M     No            740985  3597695 2013-08-05 09:20:00 211.     1.93      109.  
    # 7 T237        M     No            741002  3597627 2013-08-06 10:36:00  70.1    1.05       66.6 
    # 8 T237        M     No            740860  3597865 2013-08-07 10:52:00 277.     1.01      274.  
    # 9 T237        M     No            740893  3597835 2013-08-13 09:49:00  44.6    5.96        7.49
    #10 T237        M     No            740968  3597704 2013-08-14 10:36:00 151.     1.03      146.  
    #11 T237        M     No            741005  3597644 2013-08-14 19:37:00  70.5    0.376     188.  
    #12 T237        M     No            740965  3597710 2013-08-17 19:13:00  77.2    2.98       25.9 
    #13 T237        M     No            740964  3597705 2013-08-18 19:25:00   5.10   1.01        5.06
    #14 T237        M     No            740916  3597720 2013-08-20 18:56:00  50.3    1.98       25.4 
    #15 T237        M     No            741025  3597736 2013-08-21 05:46:00 110.     0.451     244.  
    #16 T237        M     No            740963  3597828 2013-08-27 10:07:00 111.     6.18       17.9 
    #17 T237        M     No            740954  3597626 2013-08-28 09:51:00 202.     0.989     204.  
    #18 T237        M     No            741019  3597768 2013-08-30 09:54:00 156.     2.00       78.0 
    #19 T237        M     No            740871  3597861 2013-09-01 11:07:00 175.     2.05       85.2 
    

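    Now you can summarize the data however you'd like, for example with mean, and join the result to your other data. Only T237 has locations in this example, so every other individual gets NA after the right join: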

    result %>% 
      summarise(AveDist = mean(DistPerDay, na.rm = TRUE)) %>%
      right_join(data2)
    ## A tibble: 7 x 5
    #  TelemetryID AveDist     Area Date    Sex  
    #  <fct>         <dbl>    <dbl> <fct>   <fct>
    #1 T237           85.0 6.64e-11 8/10/11 M    
    #2 T002           NA   5.94e-12 8/7/11  M    
    #3 T003           NA   1.39e-10 8/10/11 M    
    #4 T004           NA   8.18e-12 8/7/11  M    
    #5 T005           NA   4.93e-11 8/9/11  M    
    #6 T006           NA   2.70e-11 8/10/11 M    
    #7 T007           NA   1.17e-10 8/10/11 F
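
As a cross-check that needs no packages, the same per-step calculation can be sketched in base R. This is an illustrative alternative, not the code used above: mean_daily_distance is a hypothetical helper, the three fixes are the first three T237 rows after sorting by time, and tz = "UTC" is an assumption. The helper also sets the rate to NA for any pair of fixes with identical timestamps, which would otherwise divide by zero.

```r
# Hypothetical helper (not part of the answer above): mean distance per day
# for one individual's fixes, using only base R.
mean_daily_distance <- function(df) {
  df <- df[order(df$DateTime), ]
  # Planar step length between consecutive fixes (UTM coordinates, metres)
  dist <- sqrt(diff(df$Easting)^2 + diff(df$Northing)^2)
  # Elapsed time between consecutive fixes, in days
  elapsed <- as.numeric(diff(df$DateTime), units = "days")
  rate <- dist / elapsed
  # Identical timestamps give a zero interval; treat those steps as missing
  rate[elapsed == 0] <- NA
  mean(rate, na.rm = TRUE)
}

# First three T237 fixes in time order
fixes <- data.frame(
  TelemetryID = "T237",
  Easting     = c(740872, 740862, 740874),
  Northing    = c(3597872, 3597862, 3597862),
  DateTime    = as.POSIXct(c("7/29/13 19:41:00", "7/30/13 10:03:00",
                             "7/31/13 10:37:00"),
                           format = "%m/%d/%y %H:%M:%S", tz = "UTC")
)

# One value per individual, as a named numeric vector
addm <- sapply(split(fixes, fixes$TelemetryID), mean_daily_distance)
print(addm)
```

The two per-step rates here are about 23.6 and 11.7, which agrees with the DistPerDay column in the dplyr output, so addm holds their mean for T237. The named vector could then be turned into a two-column data frame and merged onto the per-individual table by TelemetryID.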