Sequential numbering based on another column (date) R


I need to make a column with sequential numbers based on the date. There are multiple rows with the same date, and the result should look like this:

       nest in.temp out.temp age  date
1 501 (913)    18.0     11.5  0 06/02
2 501 (913)    17.5     12.0  0 06/02
3 501 (913)    17.5     12.0  0 06/02
4 501 (913)    17.5     12.5  0 06/02
5 501 (913)    17.5     14.0  1 06/03
6 501 (913)    18.0     13.0  1 06/03

However, my code produces NA warnings. It runs over a folder of files that all need the same output, so it would help if the fix worked inside the loop. That said, I'll be combining the data frames afterwards, so it could also be done after the fact.

nestlist1 <- lapply(1:length(nestlist), function(z) {
  #creates nests in loop
  k <- nestlist[[z]]
  k <- k[!is.na(k$In), ]
  #separates time and date from one another
  k$time <- c(format(as.POSIXct(strptime(k$DateTime, "%Y-%m-%d %H:%M:%S", tz="")),
                     format="%H:%M"))
  k$date <- c(format(as.POSIXct(strptime(k$DateTime, "%Y-%m-%d %H:%M", tz="")),
                     format="%m/%d"))
  k$time <- strptime(k$time, "%H:%M")
  #sets parameters for temperature being observed
  a <- lapply(unique(k$date), function(i)
    d <- k[k$date == i & k$time >= "2021-01-14 03:00:00 MDT" & 
             k$time <=  "2021-01-14 06:00:00 MDT", ])
  #names based on the date
  names(a) <- gsub( ".xlsx", "", unique(k$date))
  #number of variables at each date
  rn <- lapply(a, nrow)
  #????
  a <- a[rn > 0]
  b <- unlist(lapply(a, function(x) {
    x$In
  }))
  d <- unlist(lapply(a, function(x) {
    x$Out
  }))
  c <- unlist(lapply(a, function(x) {
    x$date
  }))
  e <- unlist(lapply(a, function(x) {
    x$date
  }))
  e <- as.data.frame(e)
  c <- as.data.frame(c)
  b <- as.data.frame(b)
  d <- as.data.frame(d)
  b <- cbind(b, d)
  b <- cbind(b, c)
  b <- cbind(b, e)
  colnames(b)[1] <- 'in.temp' 
  colnames(b)[2] <- 'out.temp'
  colnames(b)[3] <- 'day'
  colnames(b)[4] <- 'date'
  is.num <- sapply(b, is.numeric)
  b[is.num] <- lapply(b[is.num], round, 1)
  b$day <- as.numeric(b$day)
  head(b)
  xx <- data.frame(nest=names(nestlist)[z], in.temp= b$in.temp, 
                   out.temp=b$out.temp, age=b$day, date=b$date)
  return(xx)
})


Solution

  • Using tidyverse and lubridate:

    library(tidyverse)
    library(lubridate)
    
    read_delim('example.txt', delim = ' ') %>% 
      mutate(
        date = parse_date_time(date, 'md'),
        age = day(date) - day(date)[1L] 
      )
    

    parse_date_time takes a date of the form 06/02 and converts it to 0000-06-02. lubridate's day function then extracts the day of the month, and subtracting day(date)[1L] normalizes to the first row, giving 0, 0, 0, 0, 1, 1. Because only the day of the month is compared, this works as long as all dates fall within the same month.
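
    If the dates can run past the end of a month, subtracting whole dates rather than days of the month may be safer. A minimal base R sketch of that variant, assuming the dates are "mm/dd" strings as in the question; the sample dates and the year 2021 are made up for illustration:

```r
# Hypothetical sample dates in the question's "mm/dd" form;
# the last one crosses a month boundary on purpose.
dates <- c("06/02", "06/02", "06/03", "06/30", "07/01")

# Prepend an arbitrary year (2021 is an assumption) so as.Date can parse them.
d <- as.Date(paste0("2021/", dates), format = "%Y/%m/%d")

# Whole days elapsed since the first row.
age <- as.integer(d - d[1])
age
```

    Here the 07/01 row comes out as 29, where a day-of-month difference would give -1.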


    I used this as the input file example.txt for the read_delim call:

    nest in.temp out.temp date
    501 18.0 11.5 06/02
    501 17.5 12.0 06/02
    501 17.5 12.0 06/02
    501 17.5 12.5 06/02
    501 17.5 14.0 06/03
    501 18.0 13.0 06/03
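
The question mentions that the per-file data frames will be combined afterwards, so the age could also be computed per nest on the combined data. A minimal base R sketch of that, assuming the first row of each nest holds its earliest date; the nest labels, the dates, and the year 2021 are made up for illustration:

```r
# Hypothetical combined data: two nests, each with its own start date.
combined <- data.frame(
  nest = c("501", "501", "501", "502", "502"),
  date = c("06/02", "06/02", "06/03", "06/10", "06/12"),
  stringsAsFactors = FALSE
)

# Parse "mm/dd" with an arbitrary year (2021 is an assumption).
d <- as.Date(paste0("2021/", combined$date), format = "%Y/%m/%d")

# Within each nest, subtract that nest's first date from every row.
combined$age <- ave(as.numeric(d), combined$nest, FUN = function(x) x - x[1])
combined
```

In the tidyverse version, adding group_by(nest) before the mutate call should give the same per-nest restart.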