scala tiff geotiff geotrellis

How to combine multiple TIFFs into one large GeoTIFF in Scala?


I am working on a project to find water depth and extent using a digital ground model (DGM). I have multiple TIFF files covering the area of interest, and I want to combine them into a single TIFF file for quick processing. How can I combine them, either with my own code below or with any other approach?

I have tried to concatenate the tiles by reading them in one by one and then combining them, but this throws a GC error, probably because there is something wrong with the code itself. The code is provided below:

import geotrellis.proj4._
import geotrellis.raster._
import geotrellis.raster.io.geotiff._
object waterdepth {
  val directories = List("data")

  //constants to differentiate which bands to use
  val R_BAND = 0
  val G_BAND = 1
  val NIR_BAND = 2

  // Path to our landsat band geotiffs.
  def bandPath(directory: String) = s"../biggis-landuse/radar_data/${directory}"

  def main(args: Array[String]): Unit = {
    directories.map(directory => generateMultibandGeoTiffFile(directory))
  }

  def generateMultibandGeoTiffFile(directory: String) = {
    val tiffFiles = new java.io.File(bandPath(directory)).listFiles.map(_.toString)

    val singleBandGeoTiffArray = tiffFiles.foldLeft(Array[SinglebandGeoTiff]())((acc, el:String) => {
      acc :+ SinglebandGeoTiff(el)
    })

    val tileArray = ArrayMultibandTile(singleBandGeoTiffArray.map(_.tile))

    println(s"Writing out $directory multispectral tif")
    MultibandGeoTiff(tileArray, singleBandGeoTiffArray(0).extent, singleBandGeoTiffArray(0).crs).write(s"data/$directory.tif")
  }
}

It should create a single TIFF file from all the separate files, but it throws a memory error.


Solution

  • Your approach is correct. The OOM most likely happens because you are loading a lot of TIFFs into memory at once, so it is not surprising. The straightforward fix is to allocate more memory to the JVM. However, you can also try this small optimization, which will probably work:

    import geotrellis.proj4._
    import geotrellis.raster._
    import geotrellis.raster.io.geotiff._
    import geotrellis.raster.io.geotiff.reader._
    
    import java.io.File
    
    def generateMultibandGeoTiffFile(directory: String) = {
      val tiffs =
        new File(bandPath(directory))
          .listFiles
          .map(_.toString)
          // streaming = true won't force all bytes to load into memory
          // only tiff metadata is fetched here
          .map(GeoTiffReader.readSingleband(_, streaming = true))
    
      val (extent, crs) = {
        val tiff = tiffs.head
        tiff.extent -> tiff.crs
      }
    
      // TIFF segment bytes are fetched only during the write
      MultibandGeoTiff(
        MultibandTile(tiffs.map(_.tile)), 
        extent, crs
      ).write(s"data/$directory.tif")
    }
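To judge whether giving the JVM more memory (for example with the `-Xmx` flag) is enough, it can help to compare a rough estimate of the raw cell data against the heap the process actually has. The following is only a sketch using the standard library: `HeapCheck`, `estimatedBytes`, and the tile count and dimensions in `main` are hypothetical placeholders, not values from the question, and the estimate is a lower bound because it ignores the output raster and any intermediate copies.

```scala
object HeapCheck {
  // Rough size of the raw cell data if every tile is fully decoded in memory.
  // All parameters are hypothetical inputs; substitute your own tile sizes.
  def estimatedBytes(tileCount: Int, cols: Int, rows: Int, bytesPerCell: Int): Long =
    tileCount.toLong * cols.toLong * rows.toLong * bytesPerCell.toLong

  // Maximum heap the JVM will attempt to use (this is what -Xmx controls).
  def maxHeapBytes: Long = Runtime.getRuntime.maxMemory

  def main(args: Array[String]): Unit = {
    val mib = 1024L * 1024L
    // Placeholder numbers: 50 tiles of 10000 x 10000 cells, 4 bytes per cell.
    val needed = estimatedBytes(tileCount = 50, cols = 10000, rows = 10000, bytesPerCell = 4)

    println(s"Estimated raw cell data: ${needed / mib} MiB")
    println(s"JVM max heap:            ${maxHeapBytes / mib} MiB")

    if (needed > maxHeapBytes)
      println("The tiles will not fit in the heap; raise -Xmx or read lazily.")
  }
}
```

If the estimate is far above the available heap, raising `-Xmx` alone is unlikely to be practical, and reading with `streaming = true` as shown above is the more promising route.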