I am processing large amounts of lidar data (>4TB worth) and I am running a height normalisation function on a LAScatalog using the lidR package. I have used job::job() to run this as a background job.
My code is as follows (please note that this is not a lidR question):
las_dir <- "path/to/.las/files/"
las_cat <- readLAScatalog(las_dir, filter = "-drop_overlap -drop_class 6 7 9 13 14 15 16 17 18 0 -keep_random_fraction 0.1")
# Creating height normalise function using k-nearest neighbour inverse distance weighting
norm_height_knnidw <- function(chunk) {
  las <- readLAS(chunk)
  if (lidR::is.empty(las)) return(NULL)
  las <- normalize_height(las, algorithm = knnidw())
}
# Defining las catalog parameters for height normalisation
opt_chunk_size(las_cat) <- 250
opt_chunk_buffer(las_cat) <- 5
opt_output_files(las_cat) <- paste0(tempdir(), "{XLEFT}_normed", overwrite = TRUE)
opt_stop_early(las_cat) <- FALSE
opt <- list(automerge = TRUE)
job::job(las_normed = {
  options(mc.cores = 16) # on a high powered computer with 24 cores
  # Running height normalisation function on las catalog
  catalog_apply(las_cat, norm_height_knnidw, .options = opt)
})
This ran fine (it took a total of 5 days and 9 hours), but the output is an environment, and the next step in my workflow cannot work with it:
> class(las_normed)
[1] "environment"
> str(las_normed)
<environment: 0x00000293a7b4c538>
# starting to define chunk options for next lascatalog process
> opt_output_files(las_normed) <- paste0(las_output, "{*}_dtm_pitfree", overwrite = TRUE)
Error in ctg@chunk_options :
no applicable method for `@` applied to an object of class "environment"
Is there a way I can convert this to a usable variable? Or is it a step I have missed in the initial setting up of the job? I have tried reading through the job documentation, but I am struggling to understand quite where I have gone wrong and how to interact with the output. I have also gone through Chapter 7: Environments in Advanced R by Hadley Wickham, but I am still struggling to understand how to use the environment output. (I fully acknowledge that my limited understanding of environment objects is likely the cause of this, so any direction to more advice on them is very welcome.)
You didn't make any assignments in the job::job call, so no variables were saved to the las_normed environment (other than .jobcode, and .Random.seed if catalog_apply used random numbers). I'm guessing what you want is something like:
job::job(las_normed = {
  options(mc.cores = 16) # on a high powered computer with 24 cores
  # Running height normalisation function on las catalog
  res = catalog_apply(las_cat, norm_height_knnidw, .options = opt)
})
When it completes, the las_normed environment will contain the object res.
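Since the question is also about how to work with an environment once you have one, here is a minimal base-R sketch. It does not call job::job at all: the environment is built by hand with new.env() as a stand-in, and the file names stored in res are made-up placeholders. In the real workflow, res would hold whatever catalog_apply returned.

```r
# Stand-in for the environment that job::job() hands back.
# The value assigned to `res` is a hypothetical placeholder, not real lidR output.
las_normed <- new.env()
assign("res", c("chunk_1.las", "chunk_2.las"), envir = las_normed)

# List the objects stored in the environment.
# Names starting with "." are only shown with all.names = TRUE.
ls(las_normed)

# Pull a single object out, either with `$` or with get().
res <- las_normed$res
res_again <- get("res", envir = las_normed)

# Or copy everything into an ordinary named list.
contents <- as.list(las_normed)
```

Once res has been pulled out into a regular variable like this, it can be passed to the next step of the workflow instead of the environment itself.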
I'll risk stating the obvious and recommend you experiment with job::job using a less expensive operation until you are confident you know how it behaves before using it to perform a multi-day operation.