
How to convert POSIXct data with multiple variables into time series format


I have data with multiple variables. The str of my data is as follows:

 tibble [2,859 × 92] (S3: tbl_df/tbl/data.frame)
  $ Date : POSIXct[1:2859], format: "2010-04-01" "2010-04-02" "2010-04-05" "2010-04-06" ...
  $ Num  : num [1:2859] 1 2 3 4 5 6 7 8 9 10 ...
  $ Price: num [1:2859] 3158 3158 3159 3148 3119 ...

I would like to convert the data into a time-series format. I tried some solutions, but they did not work. For example, following "How can I transform a dataframe with POSIXct dates into a time series?", I got this error:

 DF <- data.frame(FinDat = date, FinDat$Price = sample(100, length(date), TRUE))
 Error: unexpected '=' in "DF <- data.frame(FinDat = date, FinDat$Price ="

An example of my data is:

 structure(list(Date = structure(c(1270080000, 1270166400, 1270425600, 
 1270512000, 1270598400, 1270684800, 1270771200, 1271030400, 1271116800, 
 1271203200), tzone = "UTC", class = c("POSIXct", "POSIXt")), 
Price = c(3157.957, 3157.957, 3158.681, 3148.222, 3118.709, 
3145.347, 3129.263, 3161.251, 3166.183, 3164.966)), row.names = c(NA, 
 -10L), class = c("tbl_df", "tbl", "data.frame"))

I would like to use the autoplot function and auto.arima.


Solution

  • In a comment, the poster indicated that they want to plot the series and use arima (which works on ts objects), so we need to convert the data to a regularly spaced series. Convert it to zoo and then replace the times with year + fraction, where fraction = 0, 1/N, 2/N, ..., (N-1)/N and N is the maximum number of points in any year. Converting the result to ts gives tt, which can be used with arima. Certain analyses need at least two years' worth of data, although the code below will work with less.

    library(zoo)
    
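    z <- read.zoo(dat)  # dat: the data from the question; first column is the time index
    plot(z)  # or use autoplot(z) with ggplot2 or xyplot(z) with lattice
    
    zz <- z
    yr <- as.integer(as.yearmon(time(zz)))  # calendar year of each observation
    N <- max(table(yr))                     # maximum number of points in any year
    time(zz) <- yr + (ave(yr, yr, FUN = seq_along) - 1) / N  # year + fraction
    tt <- as.ts(zz)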
    
    tt
    ## Time Series:
    ## Start = c(2010, 1) 
    ## End = c(2010, 10) 
    ## Frequency = 10 
    ##  [1] 3157.957 3157.957 3158.681 3148.222 3118.709 3145.347 3129.263 3161.251
    ##  [9] 3166.183 3164.966
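
Once the series is a ts object, it can be passed to a model-fitting function. The following is a minimal sketch in base R only, assuming the ten-point series printed above. It rebuilds tt directly with ts() so that it runs without zoo, and the ARIMA(0, 1, 0) order is an arbitrary illustrative choice, not something the data was tested for. With the forecast package installed, auto.arima(tt) could be used to select the order instead, and autoplot(tt) to plot the series.

```r
# Rebuild the series shown in the output above (values copied from the question).
tt <- ts(c(3157.957, 3157.957, 3158.681, 3148.222, 3118.709,
           3145.347, 3129.263, 3161.251, 3166.183, 3164.966),
         start = c(2010, 1), frequency = 10)

# Fit a random-walk model; the order c(0, 1, 0) is an assumption made
# purely for illustration.
fit <- arima(tt, order = c(0, 1, 0))

# Forecast three steps ahead. The point forecasts of a random walk
# without drift repeat the last observed value.
fc <- predict(fit, n.ahead = 3)
print(fc$pred)
```

Ten observations are far too few for a meaningful fit, so this only shows the mechanics of going from tt to a fitted model and a forecast.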
    
