Search code examples
rrowrow-number

how to choose a row when the row number of that row is equal to the value of the other column with duplicates in R?


I have a data frame as follows -

df <- cbind(c(1,1,1,2,2,2,3,3,3,3), c(6,12,18,3,9,12,4,8,12,16),c(3,3,3,2,2,2,4,4,4,4))
colnames(df) <- c("ID","value","index")

I want to get the following result -

df1 <- cbind(c(1,2,3), c(18,9,16),c(3,2,4))

So I basically want to extract (for each ID) the row whose row number is equal to the index for that ID. For example, the 3rd row for ID 1, 2nd row for ID 2 and 4th row for ID 4.

I tried the following code

df1 <- df%>%group_by(ID)%>%filter(index==index)

But it is not working. Please help me to solve this problem.


Solution

  • Use slice to select the index row for each ID.

    library(dplyr)
    df %>% group_by(ID) %>% slice(first(index)) %>% ungroup
    
    #     ID value index
    #  <dbl> <dbl> <dbl>
    #1     1    18     3
    #2     2     9     2
    #3     3    16     4
    

    This can be written in data.table and base R as :

    library(data.table)
    setDT(df)[, .SD[first(index)], ID]
    
    #Base R
    subset(df, index == ave(value, ID, FUN = seq_along))
    

    data

    df <- data.frame(ID = c(1,1,1,2,2,2,3,3,3,3), 
                     value = c(6,12,18,3,9,12,4,8,12,16),
                     index = c(3,3,3,2,2,2,4,4,4,4))