Is there an R function that will rank dates/times by other column criteria?


I want to convert the date column in dataf into ordered numbers within each id (earliest date = 1, second earliest = 2, and so on), as in results$order below. If an id appears only once, its order should be 1.

date=c("2012-02-18", "2013-03-01", "2013-04-11", "2013-06-06", "2013-09-20", "2013-07-02")
datef=strptime(date, format="%Y-%m-%d")
dataf=data.frame(id=c(20, 20, 20, 21, 21, 22), 
              date=datef, 
              service=c("web", "phone", "person", "phone", "web", "web"))
> dataf
  id       date service
1 20 2012-02-18     web
2 20 2013-03-01   phone
3 20 2013-04-11  person
4 21 2013-06-06   phone
5 21 2013-09-20     web
6 22 2013-07-02     web

I am having a hard time even finding the right wording to search for an answer to this problem. Am I looking to coerce or to index dataf$date into results$order below?

results=data.frame(id=c(20, 20, 20, 21, 21, 22), 
                   order=c(1,2,3,1,2,1), 
                   service=c("web", "phone", "person", "phone", "web", "web"))

> results
  id order service
1 20     1     web
2 20     2   phone
3 20     3  person
4 21     1   phone
5 21     2     web
6 22     1     web

Solution

  • With data.table:

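    library(data.table)

    setDT(dataf)               # convert dataf to a data.table by reference

    setorder(dataf, id, date)  # sort by id, then by date within each id
    dataf[, order := seq_len(.N), by = id]  # number the rows 1..N within each id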
    > dataf
       id       date service order
    1: 20 2012-02-18     web     1
    2: 20 2013-03-01   phone     2
    3: 20 2013-04-11  person     3
    4: 21 2013-06-06   phone     1
    5: 21 2013-09-20     web     2
    6: 22 2013-07-02     web     1
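
  • With base R (a sketch, not part of the original answer): ranking the dates within each id with ave() and rank() gives the same numbering without reordering the rows. It assumes the dates are stored as Date via as.Date() rather than the strptime() result used in the question, and ties.method = "first" is an assumed choice for rows of one id that share a date.

```r
# Rebuild the example data from the question.
# Assumption: dates are stored as Date (the question uses strptime()).
dataf <- data.frame(
  id      = c(20, 20, 20, 21, 21, 22),
  date    = as.Date(c("2012-02-18", "2013-03-01", "2013-04-11",
                      "2013-06-06", "2013-09-20", "2013-07-02")),
  service = c("web", "phone", "person", "phone", "web", "web")
)

# Rank the dates within each id. ave() applies FUN to each id group and
# returns the results in the original row positions.
# ties.method = "first" keeps the numbering 1, 2, 3, ... even if two rows
# of the same id share a date.
dataf$order <- ave(as.numeric(dataf$date), dataf$id,
                   FUN = function(x) rank(x, ties.method = "first"))
```

    Unlike setorder(), this leaves the row order of dataf untouched, so it also works when the rows are not already sorted by date within each id.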