
Wilcoxon Test in Loop


I want to run a Wilcoxon test on my data (AllData), which is all integer. Here's my raw data:

        Date  v1  v2   v3  v4  v5  v6  v7   v8
1 2014-01-05  39   4   84  75  41   6  83  610
2 2014-01-12  40   6   86  77  44   6  84  765
3 2014-01-19  39   5   82  73  40   6  81  713
4 2014-01-26  37   5  100  71  39   6  90  685
5 2014-02-02  39   5   83  70  37   5  79  601
6 2014-02-09  44   6   82  78  40   6  78  535
7 2014-02-16  41   5   76  76  40   7  78  582
8 2014-02-23  40   5   74  72  42   6  81  568
9 2014-03-02  35   4   81  71  39   6  78  502

Here's my code so far, and it works fine:

#calculate basefitMAE 
basefit.model1 <- Arima(AllData$v8,order=c(0,1,2)) 
inSampleBaseFitMAE <- mean(abs(basefit.model1$residuals))

#calculate advancedfitMAE
AllModel <- AllData[c(2:8)]
allfit <- data.frame(inSampleAdvancedFitMAE = rep(NA, length(AllModel)))
for (i in seq_along(AllModel[1,])) {
  advancedfit.model <- Arima(AllData$x1,order=c(0,1,2),xreg=AllModel[,i])
  allfit$inSampleAdvancedFitMAE[i] <- mean(abs(advancedfit.model$residuals))
}

allfit <- cbind(allfit,inSampleBaseFitMAE)

#measure relative MAE 
allfit$relativeMAE <- (allfit$inSampleAdvancedFitMAE)/(allfit$inSampleBaseFitMAE)

The basic call for the Wilcoxon test is this:

wilcox.test(abs(basefit.model$residuals),abs(advancedfit.model$residuals),paired=TRUE)

Now I just need to run the Wilcoxon test inside the loop. But I'm not sure how to do this, since I don't keep the absolute residuals for each model, only their mean.


Solution

  • I need to create more data to get the fit going (it is random, so your numbers will differ):

    # simulate 101 weekly rows of count data for v1..v8
    AllData <- data.frame(Date = as.Date("2014-01-05") + seq(0, 700, by = 7),
                          matrix(rnbinom(101 * 8, mu = 10, size = 1), ncol = 8))

    colnames(AllData)[2:9] <- paste0("v", 1:8)
    

    I hope I understood you correctly. Also note the typo Arima(AllData$x1, ...) in the code you posted: x1 doesn't exist in the table you showed, so I used v8 instead.

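    library(forecast)  # provides Arima()

    # baseline model: v8 on its own
    basefit.model1 <- Arima(AllData$v8, order = c(0, 1, 2))
    inSampleBaseFitMAE <- mean(abs(basefit.model1$residuals))

    # one row per candidate regressor (v1..v7), filled in by the loop
    allfit <- data.frame(var = colnames(AllData)[2:8],
                         inSampleAdvancedFitMAE = NA, p.value = NA,
                         inSampleBaseFitMAE = inSampleBaseFitMAE,
                         stringsAsFactors = FALSE)

    for (i in seq_along(allfit$var)) {
      advancedfit.model <- Arima(AllData$v8, order = c(0, 1, 2),
                                 xreg = AllData[, allfit$var[i]])
      allfit$inSampleAdvancedFitMAE[i] <- mean(abs(advancedfit.model$residuals))
      # run the test while the model is still in scope and keep only the p-value
      # note: raw residuals are compared here; wrap both in abs() to match
      # the call in the question
      test <- wilcox.test(advancedfit.model$residuals, basefit.model1$residuals,
                          paired = TRUE)
      allfit$p.value[i] <- test$p.value
    }

    allfit$relativeMAE <- (allfit$inSampleAdvancedFitMAE) / (allfit$inSampleBaseFitMAE)
    allfit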
    
      var inSampleAdvancedFitMAE     p.value inSampleBaseFitMAE relativeMAE
    1  v1               7.919351 0.025809584           7.955079   0.9955089
    2  v2               7.859983 0.954075356           7.955079   0.9880459
    3  v3               7.968860 0.165886368           7.955079   1.0017324
    4  v4               7.796261 0.128247572           7.955079   0.9800356
    5  v5               7.940206 0.002050978           7.955079   0.9981304
    6  v6               7.960803 0.122403761           7.955079   1.0007195
    7  v7               7.929170 0.111342188           7.955079   0.9967432
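
The question asked for the test on absolute residuals and was unsure how to keep them for every model. A minimal sketch of one way to do that follows, under some assumptions that are not in the original posts: base R's stats::arima() stands in for forecast::Arima() so the snippet runs without extra packages, the data are simulated with an arbitrary seed, and the names base_abs, abs_res and predictors are made up for illustration. The p-values it produces will not match the table above.

```r
# Sketch only: stats::arima() replaces forecast::Arima(), data are simulated.
set.seed(42)  # arbitrary seed, only so repeated runs give the same numbers

n <- 101
AllData <- data.frame(
  Date = as.Date("2014-01-05") + seq(0, by = 7, length.out = n),
  matrix(rnbinom(n * 8, mu = 10, size = 1), ncol = 8)
)
colnames(AllData)[2:9] <- paste0("v", 1:8)

# Baseline model: keep the full vector of absolute residuals, not just its mean
base_fit <- arima(AllData$v8, order = c(0, 1, 2))
base_abs <- abs(as.numeric(residuals(base_fit)))

predictors <- paste0("v", 1:7)   # hypothetical name for the regressor columns
abs_res <- list()                # absolute residuals of each model, by predictor
allfit <- data.frame(var = predictors,
                     inSampleAdvancedFitMAE = NA_real_,
                     p.value = NA_real_,
                     stringsAsFactors = FALSE)

for (i in seq_along(predictors)) {
  fit <- arima(AllData$v8, order = c(0, 1, 2), xreg = AllData[[predictors[i]]])
  res_i <- abs(as.numeric(residuals(fit)))
  abs_res[[predictors[i]]] <- res_i          # stored for later inspection
  allfit$inSampleAdvancedFitMAE[i] <- mean(res_i)
  # Paired Wilcoxon signed-rank test on absolute residuals, as in the question
  allfit$p.value[i] <- wilcox.test(base_abs, res_i, paired = TRUE)$p.value
}

allfit$inSampleBaseFitMAE <- mean(base_abs)
allfit$relativeMAE <- allfit$inSampleAdvancedFitMAE / allfit$inSampleBaseFitMAE
allfit
```

Keeping the residual vectors in a list means the test can also be rerun later without refitting, for example wilcox.test(base_abs, abs_res[["v3"]], paired = TRUE).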