
Why am I getting different results from ks.boot when applying it to vectors and to xts objects?


I'm working with the ks.boot function from the Matching package, and I get different output when I apply it to vectors and to xts objects, even though the data are the same. Can anybody give me a clue as to why?

library(Matching)
library(xts)

# Plain vectors (A is integer, B is double)
A <- 1:20
B <- seq(from = 2, to = 40, by = 2)

# The same data as xts objects, indexed by 20 consecutive days
dates <- seq(as.Date("2000-01-01"), length.out = 20, by = "days")
C <- as.xts(A, dates)
D <- as.xts(B, dates)


The output:

ks.boot(B,A, alternative = "t")
$ks.boot.pvalue
[1] 0.009

$ks

    Two-sample Kolmogorov-Smirnov test

data:  Tr and Co
D = 0.5, p-value = 0.01348
alternative hypothesis: two-sided


$nboots
[1] 1000

attr(,"class")
[1] "ks.boot"

ks.boot(D,C, alternative = "t")
$ks.boot.pvalue
[1] 0.672

$ks

    Two-sample Kolmogorov-Smirnov test

data:  Tr and Co
D = 0.75, p-value = 2.601e-05
alternative hypothesis: two-sided


$nboots
[1] 1000

attr(,"class")
[1] "ks.boot"
Warning messages:
1: In c.xts(Tr, Co) : mismatched types: converting objects to numeric
2: In c.xts(x, y) : mismatched types: converting objects to numeric


Solution

  • The problem is that xts objects (and the zoo objects they're based on) are always ordered by their index. The ks.boot() function randomly resamples and reorders the observations, but an xts object cannot hold its rows out of index order, so the reordering is silently undone.

    For example:

    A <- 1:10
    dates <- as.Date("2000-01-01") + A
    C <- xts(A, dates)
    
    # sample index
    set.seed(21)
    (i <- sample(1:10, 10, replace = TRUE))
    ## [1]  1  3  9 10  5  3  4 10  6  8
    
    # subset xts
    C[i,]
    ##            [,1]
    ## 2000-01-02    1
    ## 2000-01-04    3
    ## 2000-01-04    3
    ## 2000-01-05    4
    ## 2000-01-06    5
    ## 2000-01-07    6
    ## 2000-01-09    8
    ## 2000-01-10    9
    ## 2000-01-11   10
    ## 2000-01-11   10
    

    Notice that the rows are sampled with replacement, but each observation carries its index with it, so the result comes back in time order rather than in the sampled order.
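
    As a small sketch of the same effect, here is a deliberately unsorted, hand-picked index (the values in i are for illustration only) applied to both the plain vector and the xts object. The vector returns values in the order requested; the xts subset returns them in time order.

    ```r
    library(xts)

    A <- 1:10
    C <- xts(A, as.Date("2000-01-01") + A)

    # Hand-picked, unsorted positions (illustrative only)
    i <- c(9, 3, 10, 1)

    vec_result <- A[i]                 # plain vector: keeps the requested order
    xts_result <- as.numeric(C[i, ])   # xts: rows are returned in index (time) order

    print(vec_result)
    print(xts_result)
    ```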

    The solution to your problem is to use coredata(), which returns the underlying data without the time index, on the data you pass to ks.boot().

    ks.boot(coredata(D), coredata(C), alternative = "t")
    ## $ks.boot.pvalue
    ## [1] 0.006
    ## 
    ## $ks
    ## 
    ##  Two-sample Kolmogorov-Smirnov test
    ## 
    ## data:  Tr and Co
    ## D = 0.5, p-value = 0.01348
    ## alternative hypothesis: two-sided
    ## 
    ## 
    ## $nboots
    ## [1] 1000
    ## 
    ## attr(,"class")
    ## [1] "ks.boot"
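
    The ordering also matters when the two samples are pooled. The c.xts warnings in the question suggest that the samples are combined with c(); assuming that is the case, the sketch below compares pooling the xts objects with pooling their coredata(). The names pooled_xts and pooled_vec are made up for this illustration, and both series are created as doubles to avoid the type-mismatch warning.

    ```r
    library(xts)

    dates <- seq(as.Date("2000-01-01"), length.out = 20, by = "days")
    C <- xts(as.numeric(1:20), dates)
    D <- xts(seq(2, 40, by = 2), dates)

    # c() on xts objects binds the rows and sorts them by index,
    # so observations from D and C end up interleaved by date
    pooled_xts <- as.numeric(c(D, C))

    # c() on the underlying data simply appends C's values after D's
    pooled_vec <- c(coredata(D), coredata(C))

    head(pooled_vec, 20)   # the first half is exactly D's values
    head(pooled_xts, 20)   # the first half mixes values from D and C
    ```

    Any code that assumes the first half of the pooled data is the first sample will therefore see different data when given xts objects, which is avoided by passing coredata() output instead.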