Tags: r, coordinates, geospatial, wgs84

How can I find the rows where the distance has increased by 10 meters (0.01 km)?


I have a CSV with 3 columns: time, longitude, latitude. I need to extract the time at every 10 meters (0.01 km). I've managed to calculate the distance of each row from the first point (in km), which I'm treating as cumulative distance:

library(sp)

gps <- read.csv("SP1ST1.csv")
gps_sp <- SpatialPoints(cbind(gps$lng, gps$lat))
# Great-circle distance (km) from every point to the first point
test <- spDistsN1(gps_sp, gps_sp[1,], longlat=TRUE)

So the output looks like this:

 [1] 0.000000000 0.001586483 0.004574098 0.004493954 0.004887035 0.005405389 0.005930999 0.006443206 0.006991742 0.007595466 0.009693191
 [12] 0.010654023 0.010231435 0.010082614 0.012005496 0.012905777 0.013896484 0.014873557 0.015857558 0.016905208 0.013991941 0.017441699
 [23] 0.017797154 0.018539821 0.019254225 0.019914940 0.020634398 0.021411878 0.022246358 0.023037314 0.023832587 0.024608449 0.023977990

I can see just by looking at the output that my first increase of approximately 0.01 km is between row 1 and row 11, and the second one is between row 11 and row 26.

I need to write R code that will find all of these jumps for me, but each jump is not exactly 0.01 km, and the jumps are not evenly spaced through the rows. I also need to link this back to the original "gps" object so I can extract the times associated with each ~0.01 km increase.

How do I do this?

Edit: Added a data sample below.

sample <- dput(head(gps,30))
                        filename  taken_at       lng      lat gps_altitude
1  20230718_GSL_SP1ST1_4k_01.MOV  14:11:05 -65.36897 49.95216      -31.625
2  20230718_GSL_SP1ST1_4k_01.MOV  14:11:08 -65.36898 49.95218      -31.373
3  20230718_GSL_SP1ST1_4k_01.MOV  14:11:12 -65.36899 49.95220      -31.254
4  20230718_GSL_SP1ST1_4k_01.MOV  14:11:13 -65.36898 49.95220      -31.604
5  20230718_GSL_SP1ST1_4k_01.MOV  14:11:14 -65.36897 49.95221      -31.419
6  20230718_GSL_SP1ST1_4k_01.MOV  14:11:15 -65.36897 49.95221      -31.432
7  20230718_GSL_SP1ST1_4k_01.MOV  14:11:16 -65.36896 49.95222      -31.445
8  20230718_GSL_SP1ST1_4k_01.MOV  14:11:17 -65.36896 49.95222      -31.459
9  20230718_GSL_SP1ST1_4k_01.MOV  14:11:18 -65.36895 49.95222      -31.472
10 20230718_GSL_SP1ST1_4k_01.MOV  14:11:19 -65.36895 49.95223      -31.485
11 20230718_GSL_SP1ST1_4k_01.MOV  14:11:20 -65.36900 49.95225      -31.328
12 20230718_GSL_SP1ST1_4k_01.MOV  14:11:21 -65.36899 49.95226      -31.322
13 20230718_GSL_SP1ST1_4k_01.MOV  14:11:22 -65.36901 49.95225      -31.462
14 20230718_GSL_SP1ST1_4k_01.MOV  14:11:23 -65.36903 49.95224      -31.614
15 20230718_GSL_SP1ST1_4k_01.MOV  14:11:24 -65.36899 49.95227      -31.272
16 20230718_GSL_SP1ST1_4k_01.MOV  14:11:25 -65.36898 49.95228      -31.189
17 20230718_GSL_SP1ST1_4k_01.MOV  14:11:26 -65.36897 49.95229      -31.102
18 20230718_GSL_SP1ST1_4k_01.MOV  14:11:27 -65.36896 49.95230      -31.015
19 20230718_GSL_SP1ST1_4k_01.MOV  14:11:28 -65.36895 49.95230      -30.927
20 20230718_GSL_SP1ST1_4k_01.MOV  14:11:29 -65.36894 49.95231      -30.838
21 20230718_GSL_SP1ST1_4k_01.MOV  14:11:30 -65.36899 49.95229      -32.265
22 20230718_GSL_SP1ST1_4k_01.MOV  14:11:31 -65.36901 49.95232      -31.533
23 20230718_GSL_SP1ST1_4k_01.MOV  14:11:32 -65.36901 49.95232      -31.781
24 20230718_GSL_SP1ST1_4k_01.MOV  14:11:33 -65.36900 49.95233      -31.921
25 20230718_GSL_SP1ST1_4k_01.MOV  14:11:34 -65.36899 49.95234      -32.056
26 20230718_GSL_SP1ST1_4k_01.MOV  14:11:35 -65.36898 49.95234      -32.188
27 20230718_GSL_SP1ST1_4k_01.MOV  14:11:36 -65.36897 49.95235      -32.320
28 20230718_GSL_SP1ST1_4k_01.MOV  14:11:37 -65.36896 49.95236      -32.452
29 20230718_GSL_SP1ST1_4k_01.MOV  14:11:38 -65.36901 49.95236      -31.729
30 20230718_GSL_SP1ST1_4k_01.MOV  14:11:39 -65.36901 49.95237      -31.705

Solution

  • We can set thresholds at every multiple of 0.01 km and, for each one, find the first row whose distance reaches it. Then we can filter for just those rows:

    # Load libraries.
    
    library(tidyverse)
    library(sp)
    
    # Using your sample data as `gps`, find each point's distance (km) from the first point.
    
    gps_sp <- SpatialPoints(cbind(gps$lng,gps$lat))
    test <- spDistsN1(gps_sp, gps_sp[1,], longlat=TRUE)
    
    # Add output values to dataframe.
    
    gps$test <- test
    
    # For each 0.01 km threshold, find the first row at or beyond it.
    
    thresholds <- seq(0.01, max(gps$test), by = 0.01)
    
    threshold_indices <- as.data.frame(thresholds) %>%
      mutate(index = map_int(thresholds, ~which(gps$test >= .x)[1]))
    
    # Flag the rows that are the first to reach a threshold.
    
    final_gps <- gps %>%
      mutate(row_id = row_number()) %>%
      mutate(passes_threshold = row_id %in% threshold_indices$index) %>%
      select(-row_id)
    

    Now `passes_threshold` is TRUE for the first row that reaches each threshold and FALSE for every other row. To keep only those rows:

    final_gps %>%
      filter(passes_threshold)
    

    This returns:

    | filename                      | taken_at | lng       | lat      | gps_altitude | test       | passes_threshold |
    |-------------------------------|----------|-----------|----------|--------------|------------|------------------|
    | 20230718_GSL_SP1ST1_4k_01.MOV | 14:11:20 | -65.36900 | 49.95225 | -31.328      | 0.01023953 | TRUE             |
    | 20230718_GSL_SP1ST1_4k_01.MOV | 14:11:34 | -65.36899 | 49.95234 | -32.056      | 0.02007263 | TRUE             |
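
For anyone who would rather avoid the tidyverse, the threshold lookup can also be sketched in base R. This is a minimal sketch, not the method above verbatim: `first_crossings` is a hypothetical helper name, and the input is the 33-value distance vector printed in the question (not the 30-row sample data, so the row numbers differ from the table above).

```r
# Straight-line distances (km) from the first point, copied from the
# output printed in the question.
d <- c(0.000000000, 0.001586483, 0.004574098, 0.004493954, 0.004887035,
       0.005405389, 0.005930999, 0.006443206, 0.006991742, 0.007595466,
       0.009693191, 0.010654023, 0.010231435, 0.010082614, 0.012005496,
       0.012905777, 0.013896484, 0.014873557, 0.015857558, 0.016905208,
       0.013991941, 0.017441699, 0.017797154, 0.018539821, 0.019254225,
       0.019914940, 0.020634398, 0.021411878, 0.022246358, 0.023037314,
       0.023832587, 0.024608449, 0.023977990)

# Hypothetical helper: for every multiple of `step` that the data reaches,
# return the index of the first row whose distance is at or beyond it.
first_crossings <- function(d, step = 0.01) {
  # seq() would stop with an error if no threshold is ever reached.
  if (max(d) < step) return(integer(0))
  thresholds <- seq(step, max(d), by = step)
  # unique() drops repeats when one large jump crosses several thresholds.
  unique(vapply(thresholds, function(t) which(d >= t)[1], integer(1)))
}

hits <- first_crossings(d)
hits
# [1] 12 27
```

With the original data frame, `gps$taken_at[hits]` would then give the matching times. Note that `spDistsN1()` measures how far each point is from the first point, so the values can go down as well as up. If distance travelled along the track is what is wanted, one option (an untested assumption here) might be to pass `cumsum(c(0, spDists(cbind(gps$lng, gps$lat), longlat = TRUE, segments = TRUE)))` to the helper instead, keeping in mind that GPS jitter will inflate a summed path length.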