I'm processing a large data stack (pm) through an unmarked predict function. After processing, I'm using paroutPred, SE, Lower, and Upper as templates to stick the Predicted, SE, lower, and upper data and stack those rasters for plotting.
My current code is below. The foreach loop seems to be running fine and I'm getting the necessary variables out. After all is said and done, compareRaster(rs, pm) result is TRUE. Only the values differ.
After executing plot(rs) the four rasters draw, but they're all jacked up as shown here: What is drawn should be occupancy probability for a species across a map of Wyoming.
I haven't figured out what's going wrong. I ran it sequentially in a for loop (3 days processing time) to confirm the raster output is wrong.
Does anyone have any insight to my issue? All help is greatly appreciated.
paroutPred<- pm[[1]]
paroutSE<- pm[[1]]
paroutLower<- pm[[1]]
paroutUpper<- pm[[1]]
paroutPred[]<- NA
paroutSE[]<- NA
paroutLower[]<- NA
paroutUpper[]<- NA
library(doSNOW)
nc<- detectCores()-1
cl<- makeCluster(nc);cl
registerDoSNOW(cl)
comb<- function(...){
mapply('rbind', ..., SIMPLIFY = F)
}
predictions<-
foreach(i = 1:nrow(pm), .combine = 'comb', .multicombine = T,
.maxcombine = 200, .packages = c("unmarked", "raster"), .verbose = T
)%dopar%{
test<- cellFromRow(pm, row=i)
# make into a data.frame for prediction
tmp<- data.frame(pm[test])
# test which are na
na<- sapply(1:nrow(tmp), FUN = function(x){any(is.na(tmp[x, ]))})
# deal with writing the data back to raster
if(length(which(na)) != nrow(tmp)){
# Predict the new data
pred<- predict(fmBest, "state", tmp)
}
list(test, na, pred)
}
stopCluster(cl)
predlist<- list(predictions[[3]][["Predicted"]], predictions[[3]][["SE"]], predictions[[3]][["upper"]], predictions[[3]][["lower"]])
pred<- do.call(cbind, lapply(predlist, data.frame))
names(pred)<- c("Predicted", "SE", "upper", "lower")
test<- predictions[[1]]
na<- data.frame(predictions[[2]])
paroutPred[test[!na]]<- pred$Predicted
names(paroutPred)<- "Predicted"
# Save prediction
paroutSE[test[!na]]<- pred$SE
names(paroutSE)<- "SE"
# Save prediction
paroutLower[test[!na]]<- pred$lower
names(paroutLower)<- "lower"
# Save prediction
paroutUpper[test[!na]]<- pred$upper
names(paroutUpper)<- "upper"
writeRaster(paroutPred, "Ppred_PEFA.tif", format = "GTiff", overwrite = TRUE)
writeRaster(paroutSE, "Pse_PEFA.tif", format = "GTiff", overwrite = TRUE)
writeRaster(paroutLower, "Plower_PEFA.tif", format = "GTiff", overwrite = TRUE)
writeRaster(paroutUpper, "Pupper_PEFA.tif", format = "GTiff", overwrite = TRUE)
Ppred.PEFA <- raster(paste(getwd(), "/Ppred_PEFA.tif", sep=""))
Pse.PEFA <- raster(paste(getwd(), "/Pse_PEFA.tif", sep=""))
Plower.PEFA <- raster(paste(getwd(), "/Plower_PEFA.tif", sep=""))
Pupper.PEFA <- raster(paste(getwd(), "/Pupper_PEFA.tif", sep=""))
rs<- stack(c("Ppred_PEFA.tif", "Pse_PEFA.tif", "Plower_PEFA.tif", "Pupper_PEFA.tif"))
plot(rs)
I got it done finally. I used the below code. Turns out I was trying to do things the long way, exporting variables then trying to manipulate them into raster form. Once I incorporated the raster templates and made those the output with a quick raster function, BOOM!
Rasters that draw properly out of a foreach loop.
Only catch is doing it this way occupies a LOT of RAM. With each raster template being 1.2Gb in size, and all four raster templates exported to each cluster, it adds up fast. On a machine with 128 Gb of RAM I'm maxing out memory to run this on 9 cores on Windows.
# Loop through the rows and columns, subset the data, make a prediction
paroutPred<- pm[[1]]
paroutSE<- pm[[1]]
paroutLower<- pm[[1]]
paroutUpper<- pm[[1]]
paroutPred[]<- NA
paroutSE[]<- NA
paroutLower[]<- NA
paroutUpper[]<- NA
library(doSNOW)
#Assemble cluster for parallel processing
nc<- detectCores()-8
cl<- makeCluster(nc);cl
registerDoSNOW(cl)
#Combine function to use in foreach loop w/multiple returns
comb<- function(...){
mapply('rbind', ..., SIMPLIFY = F)
}
#foreach loop for predictions
system.time(
predictions<-
foreach(i = 1:nrow(pm), .combine = 'comb', .multicombine = T,
.maxcombine = 200, .packages = c("unmarked", "raster"), .verbose = T
)%dopar%{
test<- cellFromRow(pm, i)
# make into a data.frame for prediction
tmp<- data.frame(pm[test])
# test which are na
na<- sapply(1:nrow(tmp), FUN = function(i){any(is.na(tmp[i, ]))})
# deal with writing the data back to raster
if(length(which(na)) != nrow(tmp)){
# # Predict the new data
pred<- predict(fmBest, "state", tmp)
}
#List of raster format non-raster objects to return
list(
paroutPred[test[!na]]<- pred$Predicted,
paroutSE[test[!na]]<- pred$SE,
paroutLower[test[!na]]<- pred$lower,
paroutUpper[test[!na]]<- pred$upper)
})
#Close the cluster
stopCluster(cl)
#Return data to raster form
paroutPred<- raster(predictions[[1]], template = paroutPred)
paroutSE<- raster(predictions[[2]], template = paroutSE)
paroutLower<- raster(predictions[[3]], template = paroutLower)
paroutUpper<- raster(predictions[[4]], template = paroutUpper)
Found a better way. Far less RAM use. Took the templates out and realized I can handle the raster formatting after the loop (duh).
#predict function of foreach loop here
}
#List of non-raster prediction results to return
list(pred$Predicted,
pred$SE,
pred$lower,
pred$upper)
}
#Return data to raster form
paroutPred<- raster(predictions[[1]], template = paroutPred)
paroutSE<- raster(predictions[[2]], template = paroutSE)
paroutLower<- raster(predictions[[3]], template = paroutLower)
paroutUpper<- raster(predictions[[4]], template = paroutUpper)