Search code examples
rmodelingr-raster

R: raster comes up with NAs even though values were added


When I make a raster from scratch and use length(is.na(raster)) I get all cells as NA, but if I enter the code raster[] I receive the cell values.

I have used ...vals=0when making the raster to set up a raster with all 0 I have also tried raster[]<-0 which makes all my values 0 but still shows 25 NAs.

library(raster)

size<-5
resol<-5
popseq<-seq(1,100,1)
stage1<-raster(nrow=size,ncol=size,xmn=0,xmx=(size*resol),ymn=0,ymx=(size*resol),resolution=resol,vals=0)
stage1[]<-sample(popseq,size*size, replace=T)
length(is.na(stage1))
#> [1] 25
stage1[1:(size*size)]
#> [1]  48  24  33   6 100  18  31  10  40  55  15  56  27  24  16  78  93  14  95  86  87  56  90  53  55
length(is.na(stage1))
#> [1] 25

This is not just an interesting problem, but I believe it also leads to my "non-conformable arguments"-Errors downstream (I am working on a post regarding that bit of code atm, it is a little more involved).

I may be missing something frustratingly obvious here.


Solution

  • This answer is a little involved, so please bear with me.

    First of all, think about the following code:

    x <- c(1, 2, 3, 4, 5)
    

    What result would you expect to get from the following line?

    length(is.na(x))
    

    If you think the answer should be zero, that's wrong. The answer is 5. Look:

    is.na(x)
    #> [1] FALSE FALSE FALSE FALSE FALSE
    

    The line is.na(x) gives a logical vector of the same length as x. So the length of is.na(x) is 5. If you want the function that will give zero length it should be:

    length(which(is.na(x)))
    #> [1] 0
    

    However, that's not the sole reason why your code doesn't work as expected. The reason is that is.na is a generic function. This means that package developers who create a new class of object (such as a Raster in the package you are using) can write their own version of is.na to handle objects of that class appropriately.

    In the case of Raster, the authors of the package have indeed defined their own version of is.na. You can see it in the package's source code here. Unusually, instead of returning a logical TRUE or FALSE it returns the original Raster with the same dimensions, but with all of its values set to zero. Look:

    is.na(stage1)
    #> class      : RasterLayer 
    #> dimensions : 5, 5, 25  (nrow, ncol, ncell)
    #> resolution : 5, 5  (x, y)
    #> extent     : 0, 25, 0, 25  (xmin, xmax, ymin, ymax)
    #> crs        : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 
    #> source     : memory
    #> names      : layer 
    #> values     : 0, 0  (min, max)
    

    Now, it so happens that the authors of Raster have defined a generic length method too, which is defined here. That means that the object returned by is.na(stage1) will have length 25.

    So if you want to demonstrate that there are no NA values in your raster, what you really want to do is replace

    length(is.na(stage1))
    

    with

    length(which(is.na(stage1[1:(size*size)])))
    #> [1] 0