I am struggling to design an efficient solution that runs over 14*10e6 records
and assigns to each element_id the difference in degree against its previous element_id.
Clearly, for each element_id == 1 the delta is always NA,
because it has no previous element to be compared to.
Considering a data.frame like the following:
set.seed(1234)
ID <- c(rep(1, 6), rep(2, 5))
element_id <- c(seq.int(1, 6), seq.int(1, 5))
degree <- as.integer(runif(11, 0, 360)) # angular degrees go from 0 to 359 because 360 is the same as 0
mydf <- data.frame(ID, element_id, degree)
What distinguishes this question from others about differences between consecutive rows is the wrap-around: if the degree of element_id i is 350 and the degree of element_id i+1 is 10, the difference should just be 20.
You can try the function getDifference(). It takes the difference between the two angles, adds 180 to that difference, takes the result modulo 360 (%% 360), and subtracts 180.
Code:
# Function to calculate difference in degrees
getDifference <- function(degreeA = 0, degreeB = 0) {
(degreeA - degreeB + 180) %% 360 - 180
}
# Test function
getDifference(10, 350)
# [1] 20
getDifference(350, 10)
# [1] -20
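To see why the formula wraps correctly, the first test case can be traced one step at a time. This is only an illustrative sketch; the intermediate names (raw, shifted, wrapped, result) are made up for the example and are not part of the function.

```r
# Trace getDifference(10, 350) step by step (variable names are illustrative only)
raw     <- 10 - 350        # -340: the naive difference
shifted <- raw + 180       # -160: shift so the target range starts at 0
wrapped <- shifted %% 360  # 200: in R, %% with a positive modulus gives a non-negative result
result  <- wrapped - 180   # 20: shift back, so results fall in [-180, 180)
result
# [1] 20
```

One consequence of this range is that two exactly opposite angles come out as -180 rather than 180; wrap the call in abs() if only the magnitude matters.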
Apply to the OP's data:
# 1. Get the difference with the previous row (data.table shift)
# 2. Do this per ID using data.table's by argument
library(data.table)
setDT(mydf)
mydf[, degreeDiff := getDifference(degree, shift(degree)), ID]
# ID element_id degree degreeDiff
# 1: 1 1 40 NA
# 2: 1 2 224 -176
# 3: 1 3 219 -5
# 4: 1 4 224 5
# 5: 1 5 309 85
# 6: 1 6 230 -79
# 7: 2 1 3 NA
# 8: 2 2 83 80
# 9: 2 3 239 156
#10: 2 4 185 -54
#11: 2 5 249 64
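If adding data.table is not an option, roughly the same per-ID calculation can be sketched in base R with ave(). This is an alternative to the approach above, not a replacement: the name mydf2 is made up here so the block runs on its own, and nothing has been measured at the row counts mentioned in the question, where data.table may well be the better fit.

```r
# Same wrap-around difference as above, redefined so this block is self-contained
getDifference <- function(degreeA = 0, degreeB = 0) {
  (degreeA - degreeB + 180) %% 360 - 180
}

# Rebuild the example data from the question (mydf2 is a hypothetical name)
set.seed(1234)
ID <- c(rep(1, 6), rep(2, 5))
element_id <- c(seq.int(1, 6), seq.int(1, 5))
degree <- as.integer(runif(11, 0, 360))
mydf2 <- data.frame(ID, element_id, degree)

# ave() applies FUN within each ID group;
# c(NA, head(x, -1)) is the previous row's value, with NA for the first row of a group
mydf2$degreeDiff <- ave(
  mydf2$degree, mydf2$ID,
  FUN = function(x) getDifference(x, c(NA, head(x, -1)))
)
```

The first row of every ID gets NA, just as with shift(), because there is no previous degree to compare against.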