Search code examples
rappenddata-manipulation

Append first row by group to original data


I have data with a grouping variable "ID" and some values:

ID, Value
1, 1
1, 2
1, 3
1, 4
2, 5
2, 6 
2, 7
2, 8

Within each group, I want to append the first row after the last row. Desired result:

ID, Value
1, 1
1, 2
1, 3
1, 4
1, 1 # First row of ID 1 inserted as last row in the group
2, 5
2, 6 
2, 7
2, 8
2, 5 # First row of ID 2 inserted as last row in the group

I have 5000 row of this.


Solution

  • Here is a base R method:

    do.call(rbind, 
        lapply(split(dat, dat$ID), \(id_df) rbind(id_df, id_df[1,]))
    ) 
    #      ID Value
    # 1.1   1     1
    # 1.2   1     2
    # 1.3   1     3
    # 1.4   1     4
    # 1.5   1     1
    # 2.5   2     5
    # 2.6   2     6
    # 2.7   2     7
    # 2.8   2     8
    # 2.51  2     5
    

    It does give you slightly strange row names - if you care about that you can wrap it in (or pipe it to) tibble::as_tibble(), which removes row names altogether.

    Alternatively you could do data.table::rbindlist(lapply(split(dat, dat$ID), \(id_df) rbind(id_df, id_df[1,]))), becausedata.table also does not use row names.

    Data

    dat  <- read.csv(text = "ID, Value
    1, 1
    1, 2
    1, 3
    1, 4
    2, 5
    2, 6 
    2, 7
    2, 8", h=T)