Delete tail of data by group in R


I have a data frame similar to

df <- data.frame(group=c("a", "b"), value=1:16,trim=rep(1:2))

I am trying to figure out how I can remove the last rows of each group. The number of rows to remove from each group is defined in the "trim" variable.
I have figured out how to remove a fixed number of rows from every group using

x<-do.call("rbind", lapply(split(df, df$group), head,-2))

However, I can't figure out how to remove, from each group, the number of rows specified in the "trim" column. In other words, I would like group a to have its last row trimmed and group b its last 2 rows trimmed.


Solution

  • Here is a method using data.table (borrowing from @42's method):

    library(data.table)
    setDT(df)
    df[, head(.SD, -trim[1]), by=group]
    

    Which outputs:

        group value trim
     1:     a     1    1
     2:     a     3    1
     3:     a     5    1
     4:     a     7    1
     5:     a     9    1
     6:     a    11    1
     7:     a    13    1
     8:     b     2    2
     9:     b     4    2
    10:     b     6    2
    11:     b     8    2
    12:     b    10    2
    13:     b    12    2
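
For comparison, the same per-group trimming can be sketched in base R, staying close to the split/lapply approach from the question. This is only a minimal sketch: the names `trimmed` and `n_keep` are made up for illustration, and it assumes, as in the example data, that "trim" is constant within each group, so only its first value per group is used.

```r
# Same example data as in the question
df <- data.frame(group = c("a", "b"), value = 1:16, trim = rep(1:2))

# Split by group, keep all but the last trim[1] rows of each piece, recombine
trimmed <- do.call(rbind, lapply(split(df, df$group), function(g) {
  n_keep <- max(nrow(g) - g$trim[1], 0)  # rows to keep; never negative
  g[seq_len(n_keep), , drop = FALSE]
}))
rownames(trimmed) <- NULL                # drop the "a.1", "b.2", ... row names
```

The sketch indexes with seq_len() rather than calling head(g, -n) because head(g, -0) is the same as head(g, 0) and returns no rows at all, which would silently empty any group whose trim value is 0.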