I would like to insert a new row after each group (grp) and fill some of the new columns with the next row values and some with previous row values
I'm trying to use:
x<-rbind(setDT(DF), DF[,.SD[.N], grp][, color := shift(color,1L, type = "lag")][, Lat:= shift(Lat,1L, type = "lead")])[order(id)]
on a DF:
a <- c(1,2,3,4,5,6,7,8,9,10)
b <- c(10,20,30,40,50,60,70,80,90,100)
c <- c("a","a","b","b","b","a","a","b","c","c")
d <- c(11,23,67,89,90,100,101,123,200,290)
df <- data.frame(color=a, Lat=b, grp=c, id=d)
I'm probably using the shift() incorrect and I cant seem to make it work.
thanks
expected results to be:
color Lat grp id
1 1 10 a 11
2 2 20 a 23
new row with color from previous row, Lat from next row, grp from previous and if from next
3 3 30 b 67
4 4 40 b 89
5 5 50 b 90
new row as before
6 6 60 a 100
7 7 70 a 101
new row as before
8 8 80 b 123
new row as before
9 9 90 c 200
10 10 100 c 290
We create two lead
columns i.e. next row values of 'Lat', and 'id' before we do the group by rleid
of 'grp'. Here, the rleid
is checking whether the adjacent elements of 'grp' are the same or not. If it is different, it assigns a new id to that element
library(data.table)
setDT(df)[, c("LatN", "idN") := shift(.SD, type = 'lead'), .SDcols = c('Lat', 'id')]
Get the last
observation of the selected columns by the rleid
of 'grp'
tmp <- df[, .(color = color[.N], Lat = LatN[.N], id = idN[.N], grp = grp[.N]),
.(grp1 = rleid(grp))][, grp1 := NULL]
rbind
with the original dataset columns, order
by the 'color' and remove any NA
rows
na.omit(rbind(df[,.(color, Lat, id, grp)], tmp)[order(color)])
# color Lat id grp
# 1: 1 10 11 a
# 2: 2 20 23 a
# 3: 2 30 67 a
# 4: 3 30 67 b
# 5: 4 40 89 b
# 6: 5 50 90 b
# 7: 5 60 100 b
# 8: 6 60 100 a
# 9: 7 70 101 a
#10: 7 80 123 a
#11: 8 80 123 b
#12: 8 90 200 b
#13: 9 90 200 c
#14: 10 100 290 c