
Working with Temp raster from s3 bucket acquired using s3read_using function


I'm pulling a .tif from an S3 bucket using the aws.s3 R package:

test_tif <- s3read_using(FUN = raster, object = "test_tif.tif", bucket = "bucketname")

This places the raster in my global environment as test_tif.

When I try to perform any raster-based operation, I repeatedly get this error:

Error in .local(.Object, ...) : 

There are no further error codes or warnings.

Looking at the structure of the raster, nothing differs from the same .tif read in from a local directory.

The only difference is that one is backed by a temp file.

Any ideas on how to work around this?

Using s3read_using is a must, as this will eventually be incorporated into a Shiny app.

Thanks.


Solution

  • What I see is that s3read_using downloads the file (with save_object), applies the function with the file as its argument, and then deletes the file. That works if the function reads the data into memory. But the raster function only reads the metadata from the file and reads the actual values later, as needed.

    So if I do

    r <- s3read_using(FUN = raster, object = "test.tif", bucket = "bucketname")
    f <- filename(r)
    #"C:\\temp\\RtmpcbsI2z\\file9b846977650.tif"
    file.exists(f)
    #[1] FALSE
    

    The file is gone, and you cannot do anything with RasterLayer r.
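
    The same behaviour can be reproduced without a bucket. The sketch below is only an illustration: read_then_delete is a hypothetical stand-in for the download, apply, delete sequence described above (it is not the aws.s3 implementation), and the small GeoTIFF is generated locally.

    ```r
    library(raster)

    # Hypothetical stand-in for s3read_using: copy a file to a temp path,
    # apply FUN to that path, then delete the temp copy on exit.
    read_then_delete <- function(FUN, path) {
      tmp <- tempfile(fileext = ".tif")
      on.exit(unlink(tmp))
      file.copy(path, tmp)
      FUN(tmp)
    }

    # Write a small GeoTIFF locally so the example needs no bucket.
    src <- tempfile(fileext = ".tif")
    writeRaster(raster(matrix(1:100, 10, 10)), src)

    # Lazy: only metadata is read before the temp copy is deleted.
    lazy <- read_then_delete(raster, src)

    # Eager: all values are read into memory before the temp copy is deleted.
    eager <- read_then_delete(function(f) readAll(raster(f)), src)

    file.exists(filename(lazy))  # does the backing file still exist?
    inMemory(lazy)               # were the values loaded?
    inMemory(eager)
    ```

    Checking inMemory() on the object returned by s3read_using is a quick way to tell whether it still depends on the deleted temp file.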

    A workaround could be to read all the values immediately with readAll. If that is not possible, you could also multiply the values by 1. This would have a similar effect, except that for very large files it would write the result to a (more) permanent temp file instead of keeping it in memory.

    rr <- s3read_using(FUN = function(f) readAll(raster(f)), object = "test.tif", bucket = "bucketname")
    # or 
    rr <- s3read_using(FUN = function(f) raster(f) * 1, object = "test.tif", bucket = "bucketname")
    

    But in that case you might as well use the save_object function, which is what you want to avoid.
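
    For comparison, the save_object route might look like the sketch below. The helper name fetch_raster and the choice of tempdir() as the download directory are assumptions made for illustration; the bucket and object names are the placeholders used above.

    ```r
    # Hypothetical helper: download the object to a path that is not deleted
    # automatically, then open it lazily with raster().
    fetch_raster <- function(object, bucket, dir = tempdir()) {
      local_path <- file.path(dir, basename(object))
      # save_object writes the S3 object to `file`
      aws.s3::save_object(object = object, bucket = bucket, file = local_path)
      raster::raster(local_path)
    }

    # Usage (needs valid AWS credentials and an existing object):
    # r <- fetch_raster("test.tif", "bucketname")
    ```

    Because the file stays on disk, the RasterLayer can keep reading values from it later, at the cost of managing the downloaded copy yourself.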

    Perhaps you can instead use Cloud Optimized GeoTIFFs and access them through GDAL's virtual file system, like this: "/vsicurl/https://mybucket/test.tif". You should be able to restrict access to your domain only. Also, the terra package might give you better performance than raster.
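
    A minimal sketch of that last suggestion with terra, assuming the file is reachable over HTTPS. The URL is the placeholder from above and the helper names vsicurl_path and open_cog are made up for this example.

    ```r
    # Hypothetical helper: prepend GDAL's /vsicurl/ prefix to an http(s) URL.
    vsicurl_path <- function(url) paste0("/vsicurl/", url)

    # Hypothetical helper: open a remote GeoTIFF without downloading it first.
    # GDAL fetches the parts of the file it needs when values are requested.
    open_cog <- function(url) terra::rast(vsicurl_path(url))

    # Usage (placeholder URL, replace with a real one):
    # r <- open_cog("https://mybucket/test.tif")
    ```

    Nothing is written to a temp file here, so there is no file to disappear after the read.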