I have the following data. How can I calculate the difference between each height and the first height (height[0]) within each id? That is, after grouping by id, the first heightdiff should be 0, the next should be height[1]-height[0], and so on. For example, using the zoo package or diff. Thanks.
dat <- structure(list(id = c(80006L, 80006L, 80006L, 80006L, 80006L,
80006L, 80006L, 80006L, 80006L, 80006L, 80006L, 80006L, 80006L,
80006L, 80006L, 80016L, 80016L, 80016L, 80016L, 80016L, 80016L,
80024L, 80024L, 80024L, 80024L, 80024L, 80024L, 80024L, 80024L,
80024L), group = c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L), height = c(97.12, 101.35, 102.39, 103.49, 101.64,
105.88, 109.31, 107.37, 115.08, 116.83, 119.03, 117.01, 122.57,
132.27, 162.08, 105.01, 108.13, 115.58, 122.46, 130.33, 148.52,
89.78, 95.27, 98.99, 98.55, 100.84, 108.46, 109.49, 115.75, 118.52
)), row.names = c(NA, 30L), class = "data.frame")
R uses 1-origin indexing (rather than 0), so the first element of a vector is x[1]. We can use ave to subtract the first height from every height within each id. No packages are needed.
transform(dat, hdiff = ave(height, id, FUN = function(x) x - x[1]))
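As a quick check of what this produces, here is a minimal sketch on a small made-up data frame. The name toy and its values are hypothetical and are not the data from the question:

```r
# Hypothetical toy data: two ids with three heights each
toy <- data.frame(
  id     = c(1L, 1L, 1L, 2L, 2L, 2L),
  height = c(10, 12, 15, 20, 21, 25)
)

# Within each id, subtract the first height from every height
res <- transform(toy, hdiff = ave(height, id, FUN = function(x) x - x[1]))
res$hdiff
# 0 2 5 0 1 5
```

The first row of each id is 0, and ave returns the values in the original row order, so the rows do not need to be sorted by id first.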
Alternately, with dplyr we can write:
library(dplyr) # version 1.1.0 or later
dat %>%
  mutate(hdiff = height - first(height), .by = id)
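Since the question mentions diff, one possible variant is to accumulate the consecutive differences within each id, which gives the same result as subtracting the first value, up to floating-point rounding. This is only a sketch in base R, and the toy data frame is hypothetical:

```r
# Hypothetical toy data: two ids with three heights each
toy <- data.frame(
  id     = c(1L, 1L, 1L, 2L, 2L, 2L),
  height = c(10, 12, 15, 20, 21, 25)
)

# diff() gives consecutive differences; prepending 0 and taking the
# running sum recovers the difference from the first value in each id
toy$hdiff <- ave(toy$height, toy$id, FUN = function(x) cumsum(c(0, diff(x))))
toy$hdiff
# 0 2 5 0 1 5
```

The direct x - x[1] form is simpler and does not accumulate rounding error, so the diff form is mostly worth considering when the consecutive differences are needed as well.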