Search code examples
rtime-seriesxtszoo

Transform the dataframe to xts object


I have a dataframe (time series) similar to this dummy data:

df <- data.frame(stringsAsFactors=FALSE,
      symbol = c("N2", "NJ", "K-Kl", "K-P3", "K-N", "KP+", "K13", "KS",
                 "KTotal", "P500", "P800", "P23", "P55", "PA", "PKA"),
        date = c("2017-10-12", "2017-10-12", "2017-10-12", "2017-10-12",
                 "2017-10-12", "2017-10-12", "2017-10-12", "2017-10-12",
                 "2017-10-12", "2017-10-12", "2017-10-12", "2017-10-12", "2017-10-12",
                 "2017-10-12", "2017-10-12"),
     open_pr = c(10.2, 2.7, 0.5, 4.5, 2.9, 8.1, 2.3, 1, 43.2, 28.5, 5.8, 6.7,
                 5.7, 0.1, 10),
       gross = c(460L, 121L, 21L, 203L, 130L, 363L, 102L, 45L, 1946L, 1282L,
                 262L, 303L, 256L, 6L, 449L),
     avg_aud = c(19L, 3L, 0L, 5L, 5L, 21L, 4L, 1L, 153L, 92L, 10L, 14L, 6L, 0L,
                 27L),
          ts = c(59L, 32L, 31L, 34L, 57L, 83L, 59L, 28L, 113L, 103L, 53L, 69L,
                 33L, 4L, 87L),
          tv = c(6L, 1L, 0L, 2L, 2L, 7L, 1L, 0L, 49L, 29L, 3L, 5L, 2L, 0L, 9L)
)

head(df)

   symbol       date open_pr gross avg_aud  ts tv
1      N2 2017-10-12    10.2   460      19  59  6
2      NJ 2017-10-12     2.7   121       3  32  1
3    K-Kl 2017-10-12     0.5    21       0  31  0
4    K-P3 2017-10-12     4.5   203       5  34  2
5     K-N 2017-10-12     2.9   130       5  57  2

my snippet

df %>% 
  as.tbl() %>% 
  mutate(date = ymd(date)) %>% 
  as.xts(date_col = date)

error message

Error in as.POSIXlt.character(x, tz, ...) : 
  character string is not in a standard unambiguous format

I would like to convert this dataframe into xts object. something similar to stock market data

library(quamtmod)
x <- getSymbols("GOOG", auto.assign = FALSE)

result :

           GOOG.Open GOOG.High  GOOG.Low GOOG.Close GOOG.Volume GOOG.Adjusted
2007-01-03  231.4944  236.7899  229.0652   232.2842    15513200      232.2842
2007-01-04  232.9847  240.4114  232.6618   240.0686    15877700      240.0686
2007-01-05  239.6910  242.1749  237.5102   242.0209    13833500      242.0209
2007-01-08  242.2693  243.3522  239.5420   240.2276     9570600      240.2276
2007-01-09  241.1565  242.5475  239.0452   241.1814    10832700      241.1814

Solution

  • The code below will give you what you want with dplyr and piping. I'm not sure why everything needs to be done with piping as not every function is built for magrittr pipes. For as.xts you need to reference to the date column with .$ if you want to use piping.

    But the outcome will not be useful. xts transforms the data in a matrix and since Symbol and date are in the matrix the whole matrix will be a character matrix.

    library(xts)
    library(dplyr)
    
    df %>% 
      mutate(date = as.Date(date)) %>% 
      as.xts(order.by = .$date)
    
               symbol   date         open_pr gross  avg_aud ts    tv  
    2017-10-12 "N2"     "2017-10-12" "10.2"  " 460" " 19"   " 59" " 6"
    2017-10-12 "NJ"     "2017-10-12" " 2.7"  " 121" "  3"   " 32" " 1"
    2017-10-12 "K-Kl"   "2017-10-12" " 0.5"  "  21" "  0"   " 31" " 0"
    2017-10-12 "K-P3"   "2017-10-12" " 4.5"  " 203" "  5"   " 34" " 2"
    2017-10-12 "K-N"    "2017-10-12" " 2.9"  " 130" "  5"   " 57" " 2"
    2017-10-12 "KP+"    "2017-10-12" " 8.1"  " 363" " 21"   " 83" " 7"
    2017-10-12 "K13"    "2017-10-12" " 2.3"  " 102" "  4"   " 59" " 1"
    2017-10-12 "KS"     "2017-10-12" " 1.0"  "  45" "  1"   " 28" " 0"
    2017-10-12 "KTotal" "2017-10-12" "43.2"  "1946" "153"   "113" "49"
    2017-10-12 "P500"   "2017-10-12" "28.5"  "1282" " 92"   "103" "29"
    2017-10-12 "P800"   "2017-10-12" " 5.8"  " 262" " 10"   " 53" " 3"
    2017-10-12 "P23"    "2017-10-12" " 6.7"  " 303" " 14"   " 69" " 5"
    2017-10-12 "P55"    "2017-10-12" " 5.7"  " 256" "  6"   " 33" " 2"
    2017-10-12 "PA"     "2017-10-12" " 0.1"  "   6" "  0"   "  4" " 0"
    2017-10-12 "PKA"    "2017-10-12" "10.0"  " 449" " 27"   " 87" " 9"
    

    But if you want something like your example on the bottom with google, use something like below.

    Step 1 is to create a function to create xts timeseries with the symbol in front of the column names. Step 2 split your original data by symbol and create a list to contain all the data in a named list. Step 3 is to use Map to apply the function to the data. After this you can access all the data in the my_data list.

    my_func <- function(x, symbol){
      index <- as.Date(x[["date"]])
      x <- x[, setdiff(colnames(x), c("symbol", "date"))]
      x <- xts::as.xts(x, order.by = index)
      colnames(x) <- paste0(symbol, ".", colnames(x))
      return(x)
    }
    
    my_data <- split(df, df$symbol)
    
    my_data <- Map(my_func, my_data, names(my_data))
    
    head(my_data, 2)
    $`K-Kl`
               K-Kl.open_pr K-Kl.gross K-Kl.avg_aud K-Kl.ts K-Kl.tv
    2017-10-12          0.5         21            0      31       0
    
    $`K-N`
               K-N.open_pr K-N.gross K-N.avg_aud K-N.ts K-N.tv
    2017-10-12         2.9       130           5     57      2