Search code examples
rdatetimedataframeposixlt

How to change a dataframe including Date-time with daily resolution to half-hourly resolution


I have a data.frame like

   WWH        V1        V2        V3        Names
  2018-01-01 0.3240454 0.4044979 0.6208009     a
  2018-01-01 0.7240454 0.6044979 0.9208009     b
  2018-01-01 0.6124702 0.9391351 0.1459288     c
  2018-01-02 0.5754003 0.9088237 0.7105769     a
  2018-01-02 0.6947945 0.1100394 0.4810563     b
  2018-01-02 0.3207489 0.4254129 0.1989616     c

in which the resolution of Date-time is daily. I need to change the resolution of Date-time to half hourly. So basically I need to repeat each row 48 times for which all the columns stay consistent except the first column that will get half hourly time values for the same date

  WWH                 V1        V2        V3        Names
  2018-01-01   00:00:00     0.3240454 0.4044979 0.6208009     a
  2018-01-01   00:30:00     0.3240454 0.4044979 0.6208009     a
  2018-01-01   01:00:00     0.3240454 0.4044979 0.6208009     a

. . .

  2018-01-02   21:30:00     0.3207489 0.4254129 0.1989616     c
  2018-01-02   22:00:00     0.3207489 0.4254129 0.1989616     c
  2018-01-02   22:30:00     0.3207489 0.4254129 0.1989616     c
  2018-01-02   23:00:00     0.3207489 0.4254129 0.1989616     c
  2018-01-02   23:30:00     0.3207489 0.4254129 0.1989616     c

here is reproducible code

WWH<-seq(as.POSIXlt("2018/1/1"), as.POSIXlt("2018/1/5"), "days")
Names<-c("a","b","c","d","e")
A1<- cbind("Date"=rep(WWH[1],5),as.data.frame(matrix(runif(15),5,3)),"Names"=Names)
A2<-cbind("Date"=rep(WWH[2],3),as.data.frame(matrix(runif(9),3,3)),"Names"=Names[1:3])
A3<-cbind("Date"=rep(WWH[3],2),as.data.frame(matrix(runif(2),2,3)),"Names"=Names[4:5])
df<-rbind(A1,A2,A3)

Solution

  • Here's a solution using two steps with rep() and seq().

    Data:

    WWH<-seq(as.POSIXlt("2018/1/1"), as.POSIXlt("2018/1/5"), "days")
    Names<-c("a","e","r","c","u")
    df <- cbind(WWH,as.data.frame(matrix(runif(15),5,3)),Names)
    

    First we clone all rows of the dataframe 48 times to account for 48 half hours.

    df.exp <- df[rep(row.names(df), each = 48), ]
    

    Then we replace WWH by a sequence of half hours, beginning with the first day, ending at 23:30 of the last day:

    df.exp$WWH <- seq(
      from=df$WWH[1],
      to=df$WWH[nrow(df)] + 84600,
      by=1800
    )
    

    Result:

    > head(df.exp)
                        WWH       V1         V2        V3 Names
    1   2018-01-01 00:00:00 0.639078 0.01123183 0.4661781     a
    1.1 2018-01-01 00:30:00 0.639078 0.01123183 0.4661781     a
    1.2 2018-01-01 01:00:00 0.639078 0.01123183 0.4661781     a
    1.3 2018-01-01 01:30:00 0.639078 0.01123183 0.4661781     a
    1.4 2018-01-01 02:00:00 0.639078 0.01123183 0.4661781     a
    1.5 2018-01-01 02:30:00 0.639078 0.01123183 0.4661781     a
    
    > tail(df.exp)
                         WWH        V1        V2        V3 Names
    5.42 2018-01-05 21:00:00 0.1457907 0.5508916 0.7658603     u
    5.43 2018-01-05 21:30:00 0.1457907 0.5508916 0.7658603     u
    5.44 2018-01-05 22:00:00 0.1457907 0.5508916 0.7658603     u
    5.45 2018-01-05 22:30:00 0.1457907 0.5508916 0.7658603     u
    5.46 2018-01-05 23:00:00 0.1457907 0.5508916 0.7658603     u
    5.47 2018-01-05 23:30:00 0.1457907 0.5508916 0.7658603     u
    

    Bibliography:

    Replicate each row of data.frame and specify the number of replications for each row

    Create a time series by 30 minute intervals

    How to subtract/add days from/to a date?


    Edit: Here's a dplyr version using interaction to create a grouping variable:

    WWH<-seq(as.POSIXlt("2018/1/1"), as.POSIXlt("2018/1/5"), "days")
    Names<-c("a","b","c","d","e")
    A1<- cbind("Date"=rep(WWH[1],5),as.data.frame(matrix(runif(15),5,3)),"Names"=Names)
    A2<-cbind("Date"=rep(WWH[2],3),as.data.frame(matrix(runif(9),3,3)),"Names"=Names[1:3])
    A3<-cbind("Date"=rep(WWH[3],2),as.data.frame(matrix(runif(2),2,3)),"Names"=Names[4:5])
    df<-rbind(A1,A2,A3)
    
    df.exp <- df[rep(row.names(df), each = 48), ]
    
      df.exp <- df.exp %>%
      mutate(temp = droplevels(interaction(df.exp$Date, df.exp$Names))) %>%
      group_by(temp) %>%
      mutate(Datetime = seq(
        from = unique(Date),
        to = unique(Date) + 84600,
        by = 1800
      )) %>%
      ungroup() %>%
      select(-(temp))
    
    tail(df.exp)
    
    # A tibble: 6 x 6
            Date        V1        V2        V3  Names            Datetime
          <dttm>     <dbl>     <dbl>     <dbl> <fctr>              <dttm>
    1 2018-01-03 0.4327316 0.4327316 0.4327316      e 2018-01-03 21:00:00
    2 2018-01-03 0.4327316 0.4327316 0.4327316      e 2018-01-03 21:30:00
    3 2018-01-03 0.4327316 0.4327316 0.4327316      e 2018-01-03 22:00:00
    4 2018-01-03 0.4327316 0.4327316 0.4327316      e 2018-01-03 22:30:00
    5 2018-01-03 0.4327316 0.4327316 0.4327316      e 2018-01-03 23:00:00
    6 2018-01-03 0.4327316 0.4327316 0.4327316      e 2018-01-03 23:30:00