scala, archive, gzip

How to properly decompress gz archive in Scala


I am a newbie in Scala and I have a small task that requires me to decompress a *.gz file from the resources directory. I want a proper way to do that so I can parse the file content afterwards. I have read some articles on this, such as: ONE TWO THREE

I can parse the content of a file that is not compressed, but I cannot handle a gz archive yet. It looks like I'm missing something small, as I am new to both Java and Scala.

Scala version - 2.21.0

I have a part of my code below:

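import java.io.InputStream
import scala.io.Source
import scala.util.Try

object ResourceLoader {
    // Opens a classpath resource as a raw stream
    def loadResource(fileName: String): Try[InputStream] = Try(getClass.getResourceAsStream(fileName))

    // Reads the resource line by line into a List
    def loadResourceContent(fileName: String): Try[List[String]] =
        for {
            resourceStream <- loadResource(fileName)
            resourceContent = Source.fromInputStream(resourceStream).getLines.toList
        } yield resourceContent
}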

Then I can iterate over a file that is not compressed like this:

        val content = ResourceLoader.loadResourceContent("/test_text.csv") recover {
            case e: FileNotFoundException => println(s"Requested file not found: $e")
            case e: SecurityException => println(s"Permission denied: $e")
            case e: Exception => println(s"An unknown exception occurred: $e")
        }
        content.foreach(println)

But I cannot understand how to decompress the gz archive first and then iterate over it.

I expect to use GZIPInputStream instead of getResourceAsStream in the loadResource function, but I can't work out how to do that properly.

Thank you in advance for any help!


Solution

  • As @Luis commented, this is what you can do:

    
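    import java.util.zip.GZIPInputStream
    import scala.io.{BufferedSource, Source}

    // Wrap the resource stream in GZIPInputStream so it is decompressed while being read
    val inputStream = Thread.currentThread().getContextClassLoader.getResourceAsStream("test_text.csv.gz")
    val gzipFileSource: BufferedSource = Source.fromInputStream(new GZIPInputStream(inputStream))
    
    println(gzipFileSource.getLines.toList.head)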
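
If you want to keep the Try-based shape of the ResourceLoader from the question, the same idea could be folded in roughly as below. This is only a sketch: the names GzipResourceLoader, readGzipLines and loadGzipResourceContent are made up for illustration, and UTF-8 is an assumed encoding for the file. It also turns the null that getResourceAsStream returns for a missing resource into a FileNotFoundException, so the recover block in the question has something to match on.

```scala
import java.io.{FileNotFoundException, InputStream}
import java.util.zip.GZIPInputStream
import scala.io.Source
import scala.util.Try

// Hypothetical helper object; the names are illustrative, not from the question
object GzipResourceLoader {
  // getResourceAsStream returns null (it does not throw) when the resource
  // is missing, so convert null into an explicit failure
  def loadResource(fileName: String): Try[InputStream] =
    Try(Option(getClass.getResourceAsStream(fileName))
      .getOrElse(throw new FileNotFoundException(fileName)))

  // Decompresses a gzip stream and reads it as lines (UTF-8 is an assumption);
  // closing the Source also closes the wrapped streams
  def readGzipLines(in: InputStream): Try[List[String]] = Try {
    val source = Source.fromInputStream(new GZIPInputStream(in), "UTF-8")
    try source.getLines().toList
    finally source.close()
  }

  // Opens the resource, then decompresses and reads it
  def loadGzipResourceContent(fileName: String): Try[List[String]] =
    loadResource(fileName).flatMap(readGzipLines)
}
```

It would be called like the original, for example GzipResourceLoader.loadGzipResourceContent("/test_text.csv.gz"). Note the leading slash: getClass.getResourceAsStream resolves names without it relative to the class's package, while the ClassLoader variant used above takes the name without a leading slash.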