R: Inserting Mid Values of Data Frame Row Pairs

I have a series of coordinates from Strava, recorded every 2.5 minutes, which I add to a QGIS map. I want to interpolate the points in between by taking the mean of the latitude and longitude of each pair of consecutive points.

I know I could use a for loop, but I'd rather use one of the apply family of functions. I know I need to take the current row and the next row for all but the last row.

gpsSmall is a data.frame that looks like this:

activity_no lat     lon
----------- ---     ---
1           52.5111 -1.85222
1           52.5111 -1.86224
1           52.5111 -1.87226
... etc
2           52.6189 -1.85332
2           52.6284 -1.86332
2           52.6386 -1.87332
... etc

I've then written this code to create the extra rows, which I will rbind onto the end.

splitPoints <- function(point1, point2) {
    meanLatitude = (point1$lat + point2$lat)/2
    meanLongitude = (point1$lon + point2$lon)/2

    point1$lat = meanLatitude
    point1$lon = meanLongitude

    point1
}

newPoints <- sapply(seq_len(nrow(gpsSmall) - 1),
       function(i){
           splitPoints(gpsSmall[i,], gpsSmall[i+1,])
       })

However, newPoints comes back as a matrix of 3 (the number of columns in gpsSmall) x 66 (the number of rows in gpsSmall minus 1). What am I doing wrong?

Solution

This doesn't use the apply functions, but it may make things a bit easier. Given what I think your problem is, this should do it. I assumed you wanted activity_no to act as a grouping variable, so that points are only interpolated within an activity. If not, it's even easier: just use the approx function as done below on the whole data set instead of splitting it first.

A couple of tidyverse packages:

library(dplyr)
library(purrr)

Load your data snippet:

dat <- tribble(
  ~activity_no, ~lat, ~lon,
  1,           52.5111, -1.85222,
  1,           52.5111, -1.86224,
  1,           52.5111, -1.87226,
  2,           52.6189, -1.85332,
  2,           52.6284, -1.86332,
  2,           52.6386, -1.87332
)

And now just do linear interpolation using approx (see ?approx). Given only a vector of values, approx interpolates them against their index, so setting the length of the interpolation output to nrow(.) * 2 - 1 puts exactly 1 new value in between each pair of real observations. Since the interpolation is linear, that new value is the mean of its two neighbours. You could increase the output length to get a finer level of interpolation if you wanted it.
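As a quick sanity check of the output-length idea, here is a minimal sketch on a made-up three-value vector. The names toy and res are hypothetical and not part of your data.

```r
# Made-up values, purely for illustration
toy <- c(10, 20, 40)

# With only one vector supplied, approx() treats it as y and uses 1, 2, 3 as x.
# Asking for 2 * 3 - 1 = 5 output points gives x = 1, 1.5, 2, 2.5, 3.
res <- approx(toy, n = length(toy) * 2 - 1)

res$x  # 1.0 1.5 2.0 2.5 3.0
res$y  # 10 15 20 30 40
```

The values at the half-integer positions (15 and 30) are the means of their neighbours. The pipeline that follows applies the same call to lat and lon within each activity.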

dat %>%
  split(dat$activity_no) %>%   # one data frame per activity
  map_dfr( ~ data.frame(       # interpolate each piece, then row-bind the results
                activity_no = rep(.$activity_no[1], nrow(.) * 2 - 1),
                lat = approx(.$lat, n = nrow(.) * 2 - 1)$y,
                lon = approx(.$lon, n = nrow(.) * 2 - 1)$y))

   activity_no      lat      lon
1            1 52.51110 -1.85222
2            1 52.51110 -1.85723
3            1 52.51110 -1.86224
4            1 52.51110 -1.86725
5            1 52.51110 -1.87226
6            2 52.61890 -1.85332
7            2 52.62365 -1.85832
8            2 52.62840 -1.86332
9            2 52.63350 -1.86832
10           2 52.63860 -1.87332
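
On the original question of why sapply gave a 3 x 66 matrix: sapply tries to simplify its result, and because each call returns a one-row data frame (a list of length 3), the pieces are packed into a matrix with one row per column of gpsSmall and one column per pair. If you would rather stay with the apply family, a minimal sketch is to use lapply and bind the pieces yourself. The small gpsSmall below is rebuilt from the first three rows in the question, and midList is a hypothetical name. Note that this pairs consecutive rows regardless of activity_no, so on the full data it would also create a midpoint between the last point of one activity and the first point of the next.

```r
# First three rows from the question (a single activity)
gpsSmall <- data.frame(
  activity_no = c(1, 1, 1),
  lat = c(52.5111, 52.5111, 52.5111),
  lon = c(-1.85222, -1.86224, -1.87226)
)

# Same idea as the function in the question: return point1 moved to the midpoint
splitPoints <- function(point1, point2) {
  point1$lat <- (point1$lat + point2$lat) / 2
  point1$lon <- (point1$lon + point2$lon) / 2
  point1
}

# lapply returns a plain list of one-row data frames, with no simplification
midList <- lapply(seq_len(nrow(gpsSmall) - 1), function(i) {
  splitPoints(gpsSmall[i, ], gpsSmall[i + 1, ])
})

# Stack the one-row data frames into a single data frame
newPoints <- do.call(rbind, midList)
```

Here newPoints is a data frame with one row per pair, which can then be rbind-ed onto gpsSmall as you planned.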