I have a dataframe like this.
df <- data.frame(
  id = c(rep("A", 5), rep("B", 5)),
  date = as.Date(as.Date("2022-6-1"):as.Date("2022-6-10"), origin="1970-01-01"),
  lon = 101:110 + 0.01,  # decimals added to match the printed output below
  lat = 1:10 + 0.01
)
> df
   id       date    lon   lat
1   A 2022-06-01 101.01  1.01
2   A 2022-06-02 102.01  2.01
3   A 2022-06-03 103.01  3.01
4   A 2022-06-04 104.01  4.01
5   A 2022-06-05 105.01  5.01
6   B 2022-06-06 106.01  6.01
7   B 2022-06-07 107.01  7.01
8   B 2022-06-08 108.01  8.01
9   B 2022-06-09 109.01  9.01
10  B 2022-06-10 110.01 10.01
What I want to do is to calculate the daily traveled distance for each group A and B, and store them in a new column called dist.
I figured out that using dplyr::lag and geosphere::distGeo will help, so I tried the following code.
df %>%
  group_by(id) %>%
  arrange(date, .by_group = TRUE) %>%
  mutate(dist = distGeo(.[, c(lon, lat)],
                        lag(.[, c(lon, lat)], default = first(.[, c(lon, lat)]))))
But this did not work:
Error in `mutate()`:
! Problem while computing `dist = distGeo(...)`.
ℹ The error occurred in group 1: id = "A".
Caused by error in `vectbl_as_col_location()`:
! Must subset columns with a valid subscript vector.
✖ Can't convert from `j` <double> to <integer> due to loss of precision.
I guess there is a syntax error in my mutate call, but how can I solve this?
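The error occurs because, inside mutate(), the dot refers to the whole data frame and c(lon, lat) is evaluated to the coordinate values themselves, which are then used as column positions. It is probably best to copy the lon/lat values of the previous day to separate columns, and then do the calculation rowwise: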
library(tidyverse)
library(geosphere)

df <- data.frame(
  id = c(rep("A", 5), rep("B", 5)),
  date = as.Date(as.Date("2022-6-1"):as.Date("2022-6-10"), origin="1970-01-01"),
  lon = 101:110,
  lat = 1:10
)

df %>%
  group_by(id) %>%
  # copy the previous day's coordinates (within each id) into prev_lon / prev_lat
  mutate(across(c(lon, lat), ~ lag(.x, order_by = date), .names = "prev_{.col}")) %>%
  rowwise() %>%
  # distance in metres between the current and the previous position
  mutate(dist = distGeo(c(lon, lat), c(prev_lon, prev_lat))) %>%
  ungroup()
#> # A tibble: 10 × 7
#>    id    date         lon   lat prev_lon prev_lat    dist
#>    <chr> <date>     <int> <int>    <int>    <int>   <dbl>
#>  1 A     2022-06-01   101     1       NA       NA     NA 
#>  2 A     2022-06-02   102     2      101        1 156876.
#>  3 A     2022-06-03   103     3      102        2 156829.
#>  4 A     2022-06-04   104     4      103        3 156759.
#>  5 A     2022-06-05   105     5      104        4 156666.
#>  6 B     2022-06-06   106     6       NA       NA     NA 
#>  7 B     2022-06-07   107     7      106        6 156409.
#>  8 B     2022-06-08   108     8      107        7 156246.
#>  9 B     2022-06-09   109     9      108        8 156060.
#> 10 B     2022-06-10   110    10      109        9 155851.
Created on 2022-06-15 by the reprex package (v2.0.1)
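As a possible variation on the rowwise approach: distGeo also accepts two-column (lon, lat) matrices and returns one distance per row, so the helper columns and rowwise() can be skipped by building the matrices with cbind(). This is only a sketch, assuming the same example data as in the answer; the object name res is made up for illustration.

```r
library(dplyr)
library(geosphere)

# Same example data as above (dates built with seq() instead of the integer range)
df <- data.frame(
  id = c(rep("A", 5), rep("B", 5)),
  date = seq(as.Date("2022-06-01"), by = "day", length.out = 10),
  lon = 101:110,
  lat = 1:10
)

res <- df %>%
  group_by(id) %>%
  arrange(date, .by_group = TRUE) %>%
  # lag() shifts within each id, so the first row of every group has no
  # previous position and its distance comes out as NA
  mutate(dist = distGeo(cbind(lag(lon), lag(lat)), cbind(lon, lat))) %>%
  ungroup()

res
```

Since the same point pairs go into the same function, the dist column should match the rowwise version; the vectorised call mainly saves the extra columns and tends to be faster on larger data.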