scala, java-stream, scalaz, scalaz-stream, fs2

Ideal chunk size in a Scala fs2 stream for a performance gain in production


I was wondering whether increasing the chunk size in a Scala fs2 stream gives a performance gain.

import cats.effect.{IO, Sync}
import fs2.{io, text}
import java.nio.file.Paths

def fahrenheitToCelsius(f: Double): Double =
  (f - 32.0) * (5.0/9.0)

def converter[F[_]](implicit F: Sync[F]): F[Unit] =
  io.file.readAll[F](Paths.get("testdata/fahrenheit.txt"), 4096)
    .through(text.utf8Decode)
    .through(text.lines)
    .filter(s => !s.trim.isEmpty && !s.startsWith("//"))
    .map(line => fahrenheitToCelsius(line.toDouble).toString)
    .intersperse("\n")
    .through(text.utf8Encode)
    .through(io.file.writeAll(Paths.get("testdata/celsius.txt")))
    .compile.drain

// at the end of the universe...
val u: Unit = converter[IO].unsafeRunSync()

Solution

  • The chunk size here is just the size of the buffer used to read the file's contents from the file system. So your question is equivalent to "will increasing the buffer size when reading the file give a performance gain?".

    The answer depends on the OS and hardware. The short answer for most cases: 4K (4096 bytes) is enough.
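
    Because the answer depends on the OS and hardware, one way to check it on a given machine is to measure. The following is a minimal sketch that uses only the JDK rather than fs2: it times reading the same file with several buffer sizes. The object name `BufferSizeCheck`, the helper `timeRead`, the 16 MiB temporary file, and the list of sizes are all hypothetical choices made for illustration.

```scala
import java.io.FileInputStream
import java.nio.file.{Files, Path}

object BufferSizeCheck {

  // Hypothetical helper: reads the whole file using a buffer of `bufferSize`
  // bytes and returns (total bytes read, elapsed nanoseconds).
  def timeRead(path: Path, bufferSize: Int): (Long, Long) = {
    val buffer = new Array[Byte](bufferSize)
    val start  = System.nanoTime()
    val in     = new FileInputStream(path.toFile)
    try {
      var total = 0L
      var n     = in.read(buffer)
      while (n != -1) {
        total += n
        n = in.read(buffer)
      }
      (total, System.nanoTime() - start)
    } finally in.close()
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a 16 MiB temporary file of zero bytes stands in for real input.
    val path = Files.createTempFile("chunk-test", ".bin")
    try {
      Files.write(path, new Array[Byte](16 * 1024 * 1024))
      for (size <- Seq(1024, 4096, 16384, 65536, 1048576)) {
        val (bytes, nanos) = timeRead(path, size)
        println(f"buffer=$size%8d  bytes read=$bytes%d  time=${nanos / 1e6}%.2f ms")
      }
    } finally Files.delete(path)
  }
}
```

    A single run like this is noisy: JIT warm-up and the OS page cache both affect the timings, so repeat the measurement several times (or use a benchmarking harness) before drawing conclusions. If the numbers level off around 4096 bytes, a larger chunk size passed to `readAll` is unlikely to help much.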