r dataframe group-by count

Create column representing instance rather than total count


Let's say I have the following dataframe:

ID <- c(15, 25, 90, 1, 23, 543)

animal <- c("fish", "dog", "fish", "cat", "dog", "fish")

df <- data.frame(ID, animal)

How can I create a third column giving the running occurrence number (from top to bottom) of each animal? For example, a column "Instance" with the values (1, 1, 2, 1, 2, 3). I know I can use group_by to get the total count per animal, but that is not what I'm after. Thanks.


Solution

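  • You need row_number() computed within each group of animal. The .by argument requires dplyr 1.1.0 or later; on older versions, call group_by(animal) before mutate() instead.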

    library(dplyr)
    
    df %>%
      mutate(Instance = row_number(), .by = animal)
    
    #    ID animal Instance
    # 1  15   fish        1
    # 2  25    dog        1
    # 3  90   fish        2
    # 4   1    cat        1
    # 5  23    dog        2
    # 6 543   fish        3
    

    In base R, you can use ave. Because the first argument here is the character vector df$animal, the result comes back as character:

    ave(df$animal, df$animal, FUN = seq)
    
    # [1] "1" "1" "2" "1" "2" "3"
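
If you want the counter as an integer column rather than character strings, one possible variation (a sketch, not the only way) is to pass an integer vector as the first argument to ave and use seq_along as the function. The data frame below is the one from the question. Using seq_along(df$animal) as the first argument is an assumption made for illustration; any integer vector of the same length would do.

```r
ID <- c(15, 25, 90, 1, 23, 543)
animal <- c("fish", "dog", "fish", "cat", "dog", "fish")
df <- data.frame(ID, animal)

# The first argument is an integer vector, so ave() returns integers.
# Within each animal group, seq_along() numbers the rows 1, 2, 3, ...
df$Instance <- ave(seq_along(df$animal), df$animal, FUN = seq_along)

df$Instance
# [1] 1 1 2 1 2 3
```

seq_along is used here instead of seq because it always counts the elements it is given, whatever their type or length.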