Tags: r, for-loop, floating-point, floating-accuracy

R Showing Different Results in forward and backward for loop


I get different answers when searching for a given element in an array, depending on whether the array was generated forward or backward. The search itself is the same for loop in both cases.

Example: Code that gives CORRECT ANSWER

vg   = rep(seq(0.9,1.1,0.01),90)
vals = seq(0.9,1.05,0.01)

for(val in vals){
  idx = c()
  idx = which((vg) %in% (val))
  cat(val,":",length(idx),"\t")
}

This Code gives: 0.9 : 90 0.91 : 90 0.92 : 90 0.93 : 90 0.94 : 90 0.95 : 90 0.96 : 90 0.97 : 90 0.98 : 90 0.99 : 90 1 : 90 1.01 : 90 1.02 : 90 1.03 : 90 1.04 : 90 1.05 : 90

WHICH IS CORRECT. But if I change the seq of the vg variable above using the CODE below:

vg   = rep(seq(1.1,0.9,-0.01),90)
vals = seq(0.9,1.05,0.01)

for(val in vals){
  idx = c()
  idx = which((vg) %in% (val))
  cat(val,":",length(idx),"\t")
}

I get the answer shown below, WHICH SHOWS 0 ELEMENTS WHEN SEARCHING FOR 0.96, 0.97, etc.

0.9 : 0 0.91 : 0 0.92 : 0 0.93 : 90 0.94 : 90 0.95 : 90 0.96 : 0 0.97 : 0 0.98 : 0 0.99 : 0 1 : 90 1.01 : 90 1.02 : 90 1.03 : 90 1.04 : 90 1.05 : 90

Why is there a discrepancy, since we are searching for exactly the same elements in both pieces of code? Is this an R bug?


Solution

  • To expand on Andrie's comment, this is a floating-point problem. To quote from the good book, The R Inferno:

    Once we had crossed the Acheron, we arrived in the first Circle, home of the virtuous pagans. These are people who live in ignorance of the Floating Point Gods. These pagans expect:

    .1 == .3 / 3
    [1] FALSE  
    

    to be true. The virtuous pagans will also expect:

    seq(0, 1, by=.1) == .3
    [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    

    to have exactly one value that is true.
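    The usual way around this is to compare with a tolerance instead of testing exact equality. Below is a minimal sketch; `near_equal` and its `tol` default are hypothetical names chosen for illustration, not part of base R, while `all.equal()` is the base R function for approximate comparison.

    ```r
    # Exact comparison versus tolerance-based comparison.
    x <- 0.3 / 3

    exact <- (0.1 == x)                  # exact equality of the stored doubles
    near  <- isTRUE(all.equal(0.1, x))   # equality within all.equal()'s default tolerance

    # Hypothetical vectorised helper: TRUE where a and b differ by less than tol.
    near_equal <- function(a, b, tol = 1e-9) abs(a - b) < tol

    hits <- near_equal(seq(0, 1, by = 0.1), 0.3)

    print(exact)      # FALSE, as in the quote above
    print(near)       # TRUE
    print(sum(hits))  # 1: exactly one element is close to 0.3
    ```

    The tolerance has to be small compared with the spacing of the values (0.1 here) and large compared with the rounding error, so `1e-9` is only an assumed, convenient choice.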

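    In your example, %in% matches values exactly, and seq(0.9,1.1,0.01) and seq(1.1,0.9,-0.01) do not round to identical doubles, so some elements differ in their last bits. If we instead work with integers rather than floating-point numbers, it works in both directions: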

    vg   = rep(seq(90,110,1),90)
    vals = seq(90,105,1)
    
    for(val in vals){
      idx = c()
      idx = which((vg) %in% (val))
      cat(val,":",length(idx),"\t")
    }
    
    vg   = rep(seq(110,90,-1),90)
    vals = seq(90,105,1)
    
    for(val in vals){
      idx = c()
      idx = which((vg) %in% (val))
      cat(val,":",length(idx),"\t")
    }
    90 : 90         91 : 90         92 : 90         93 : 90         94 : 90         95 : 90         96 : 90  
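
    If the data already exist as decimals, one possible workaround along the same lines is to turn both sides into integer keys before matching. This is only a sketch: it assumes the values are meant to be exact multiples of 0.01, and the names `vg_key`, `vals_key` and `counts` are introduced here for illustration.

    ```r
    # The descending sequence from the question that produced the zero counts.
    vg   <- rep(seq(1.1, 0.9, -0.01), 90)
    vals <- seq(0.9, 1.05, 0.01)

    # Scale to hundredths and round, so both sides become whole numbers
    # that can be compared exactly.
    vg_key   <- as.integer(round(vg * 100))
    vals_key <- as.integer(round(vals * 100))

    # Count the matches for each searched value.
    counts <- sapply(vals_key, function(k) sum(vg_key == k))
    names(counts) <- vals
    print(counts)
    ```

    With the keys rounded, every searched value should be found 90 times, regardless of the direction in which vg was generated.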
    

    The R Inferno is a really entertaining and informative read. I highly recommend it.

    You can also see that, by default, what you see is not what you get (WYSINWYG) by printing more digits:

    options(digits=22)
    .3/3
    [1] 0.09999999999999999167333
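
    The same kind of inspection can be applied to the two sequences from the question. The sketch below (the variable names `fwd`, `bwd` and `exact_match` are made up for illustration) reverses the descending sequence so it lines up with the ascending one, then prints the elements that are not bit-for-bit equal with 17 decimal places.

    ```r
    fwd <- seq(0.9, 1.1, 0.01)        # ascending sequence
    bwd <- rev(seq(1.1, 0.9, -0.01))  # descending sequence, reversed to align with fwd

    # Element-wise exact comparison, which is what %in% relies on.
    exact_match <- fwd == bwd

    # Show the mismatching elements with more digits than print() uses by default.
    print(sprintf("%.17f", fwd[!exact_match]))
    print(sprintf("%.17f", bwd[!exact_match]))

    # The largest difference is tiny, but it is not zero.
    print(max(abs(fwd - bwd)))
    ```

    The mismatching positions correspond to the values that were reported with 0 matches in the question.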