Tags: r, statistics, significance

Wilcoxon rank sum results: same p-value over and over?


I'm comparing pairs of data columns with the Wilcoxon rank sum test, and I get exactly the same p-value for the majority of comparisons. Judging from the data, did I make a mistake, or is everything all right? Here are some of the comparisons.

This is the comparison that I used:

wtresult<-wilcox.test(datachunk[,i],datachunk[,(i+1)],paired=FALSE)

And here are the results, each shown below the data it was computed from.

X1     X2     X3                      
339.53 354.11 435.56 425.34 434.64 436.08 
 X1    X2    X3                   
312.1 282.2 281.6    NA    NA    NA 

Wilcoxon rank sum test

data:  datachunk[, i] and datachunk[, (i + 1)]
W = 18, p-value = 0.02381
alternative hypothesis: true location shift is not equal to 0

X1     X2     X3                      
161.21 150.01 183.47 201.51 234.70 321.00 
X1    X2    X3                   
501.0 520.1 500.7    NA    NA    NA 

Wilcoxon rank sum test

data:  datachunk[, i] and datachunk[, (i + 1)]
W = 0, p-value = 0.02381
alternative hypothesis: true location shift is not equal to 0

X1     X2     X3                      
247.79 159.64 192.00 262.86 403.33 336.21 
X1    X2    X3                   
60.33 66.04 55.23    NA    NA    NA 

Wilcoxon rank sum test

data:  datachunk[, i] and datachunk[, (i + 1)]
W = 18, p-value = 0.02381
alternative hypothesis: true location shift is not equal to 0

X1    X2    X3                   
17.12 15.83 16.88 17.61 18.97 45.92 
X1    X2    X3                   
321.8 329.7 334.4    NA    NA    NA 

Solution

  • The test is a little "chunky" for small numbers of observations: the exact null distribution of W is discrete, so only a handful of p-values are possible. If you have a boundary case (all of the first argument's values are larger than all of the second argument's values, or vice versa), you will get identical p-values, and W will be either 0 or its maximum, the product of the two sample sizes.

    For a more detailed answer, we'd need to see your data, or you'd need to work from some other data set that we can all see.

    Here's an example, using the built-in mtcars data, that shows the behavior I'm talking about:

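    # First five rows of the built-in mtcars data set
    datachunk <- mtcars[1:5,]

    # Columns 1 and 2 (mpg vs. cyl): every mpg value exceeds every cyl value
    i <- 1
    wilcox.test(datachunk[,i],datachunk[,(i+1)],paired=FALSE)

    # Columns 2 and 3 (cyl vs. disp): every cyl value is below every disp value
    i <- 2
    wilcox.test(datachunk[,i],datachunk[,(i+1)],paired=FALSE)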
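
To see why 0.02381 in particular keeps coming up, here is a minimal sketch that assumes the first comparison in the question: six values against three, with the NA entries dropped and no ties. The names x, y, n, m, min_p and res are only for illustration. With no ties, all choose(n + m, m) ways of assigning ranks to the smaller group are equally likely under the null hypothesis, and exactly one of them produces each extreme value of W, so the smallest two-sided exact p-value should be 2 / choose(n + m, m).

```r
# Data copied from the first comparison in the question
x <- c(339.53, 354.11, 435.56, 425.34, 434.64, 436.08)
y <- c(312.1, 282.2, 281.6, NA, NA, NA)

# Sample sizes after dropping missing values
n <- sum(!is.na(x))
m <- sum(!is.na(y))

# Smallest two-sided exact p-value attainable with these sample sizes
min_p <- 2 / choose(n + m, m)
print(min_p)

# The test itself: every x exceeds every y, so W sits at its maximum n * m
res <- wilcox.test(x, y, paired = FALSE)
print(res$statistic)
print(res$p.value)
```

Here min_p is 2/84, roughly 0.0238, which matches the p-value in the question's output. If this reasoning holds for the rest of the data, any comparison of six values against three in which the two groups do not overlap will report that same p-value, whichever group is the larger one.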