Tags: r, lm, quantmod, quantitative-finance

Error with lm and quantmod in R


I am getting an error in R while writing a simple pairs-trading script based on linear regression. I suspect the error comes from the downloaded data, but I'm not sure whether that is right or how to fix it. I am relatively new to this sort of thing in R, so any help would be much appreciated.

library(quantmod)

symbols <- c("GOLDBEES.NS", "NIFTYBEES.NS")
getSymbols(symbols)

#[1] "GOLDBEES.NS"  "NIFTYBEES.NS"
startT  <- "2011-01-01"
endT    <- "2014-01-01"
rangeT  <- paste(startT,"::",endT,sep ="")
tGOLDBEES   <- GOLDBEES.NS[,6][rangeT]
tNIFTYBEES   <- NIFTYBEES.NS[,6][rangeT]
startO  <- "2014-02-01"
endO <- "2016-04-01"
rangeO  <- paste(startO,"::",endO,sep ="")
oGOLDBEES   <- GOLDBEES.NS[,6][rangeO]
oNIFTYBEES   <- NIFTYBEES.NS[,6][rangeO]
pdtGOLDBEES <- diff(tGOLDBEES)[-1]
pdtNIFTYBEES <- diff(tNIFTYBEES)[-1]
model <- lm(pdtGOLDBEES ~ pdtNIFTYBEES - 1)
#Error in model.frame.default(formula = pdtGOLDBEES ~ pdtNIFTYBEES - 1,         
#:   variable lengths differ (found for 'pdtNIFTYBEES')

Solution

  • The two downloaded series have different numbers of rows: nrow(GOLDBEES.NS) gives 1742 while nrow(NIFTYBEES.NS) gives 1925. Let's have a closer look:

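    # The xts index is stored as seconds since 1970-01-01; convert both to
    # whole days from that shared origin so the two series are comparable
    # even if they do not start on the same date.
    a <- attr(GOLDBEES.NS, "index")
    a <- as.integer(a %/% 86400)    # number of days since 1970-01-01
    b <- attr(NIFTYBEES.NS, "index")
    b <- as.integer(b %/% 86400)    # number of days since 1970-01-01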
    

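    Neither series has consecutive daily observations, and the two do not cover the same set of dates. Subsetting both to the same date range therefore gives vectors of different lengths, which is what lm() is complaining about. We should only work with data from dates common to both series.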

    GOLDBEES.NS <- GOLDBEES.NS[a %in% b]
    NIFTYBEES.NS <- NIFTYBEES.NS[b %in% a]
    nrow(GOLDBEES.NS)  # 1740
    nrow(NIFTYBEES.NS)  # 1740
    

    Now you can use your code:

    startT  <- "2011-01-01"
    endT    <- "2014-01-01"
    rangeT  <- paste(startT,"::",endT,sep ="")
    tGOLDBEES   <- GOLDBEES.NS[,6][rangeT]
    tNIFTYBEES   <- NIFTYBEES.NS[,6][rangeT]
    pdtGOLDBEES <- diff(tGOLDBEES)[-1]
    pdtNIFTYBEES <- diff(tNIFTYBEES)[-1]
    model <- lm(pdtGOLDBEES ~ pdtNIFTYBEES - 1)
    
    #Call:
    #lm(formula = pdtGOLDBEES ~ pdtNIFTYBEES - 1)
    
    #Coefficients:
    #pdtNIFTYBEES  
    #     -0.6383
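
As an alternative to matching the index by hand, the same alignment can be sketched with an inner join. The example below is a minimal illustration and does not use the real downloads: the prices and the names gold, nifty, both and d are made up so that it runs without network access. It assumes the xts package (which quantmod loads) and relies on merge() with join = "inner" keeping only dates present in both series.

```r
library(xts)

# Made-up prices on two slightly different calendars (illustrative only).
days  <- as.Date("2011-01-03") + 0:9
gold  <- xts(c(100, 101, 103, 102, 104, 105, 107, 106),
             order.by = days[-c(3, 7)])   # 2 of the 10 days are missing
nifty <- xts(c(50, 51, 50, 52, 53, 52, 54, 55, 56),
             order.by = days[-5])         # 1 of the 10 days is missing

# Dates present in one series but not in the other.
setdiff(as.character(index(gold)),  as.character(index(nifty)))
setdiff(as.character(index(nifty)), as.character(index(gold)))

# Inner join keeps only the dates common to both series.
both <- merge(gold, nifty, join = "inner")
colnames(both) <- c("gold", "nifty")

# Difference after aligning, then drop the leading NA row.
d <- na.omit(diff(both))

# Both columns now have the same length, so lm() no longer complains.
model <- lm(gold ~ nifty - 1, data = as.data.frame(d))
```

Differencing after the join, rather than before, means each difference in both columns spans the same pair of dates. With the real data, the two adjusted-price columns would take the place of gold and nifty.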