How can I use scalaz-stream to compute the digest of a file with java.security.MessageDigest?
I would like to do this with a constant-size memory buffer (for example 4KB). I think I understand how to start reading the file, but I am struggling to understand how to:
1) call digest.update(buf) for each 4KB chunk, which is effectively a side effect on the Java MessageDigest instance and which I guess should happen inside the scalaz-stream framework;
2) finally call digest.digest() to get the calculated digest back from within the scalaz-stream framework.
I think I roughly understand how to start:
import scalaz.stream._
import java.security.MessageDigest
val f = "/a/b/myfile.bin"
val bufSize = 4096
val digest = MessageDigest.getInstance("SHA-256")
Process.constant(bufSize).toSource
  .through(io.fileChunkR(f, bufSize))
But then I am stuck!
Any hints, please? I guess it must also be possible to wrap the creation, update, retrieval (of the actual digest calculation) and destruction of the digest object in a scalaz-stream Sink or something similar, and then call .to(), passing in that Sink. Sorry if I am using the wrong terminology; I am completely new to scalaz-stream. I have been through a few of the examples but am still struggling.
Since version 0.4, scalaz-stream contains processes to calculate digests. They are available in the hash module and use java.security.MessageDigest under the hood. Here is a minimal example of how you could use them:
import scalaz.concurrent.Task
import scalaz.stream._
object Sha1Sum extends App {
  val fileName = "testdata/celsius.txt"
  val bufferSize = 4096
  val sha1sum: Task[Option[String]] =
    Process.constant(bufferSize)
      .toSource
      .through(io.fileChunkR(fileName, bufferSize))
      .pipe(hash.sha1)
      .map(sum => s"${sum.toHex} $fileName")
      .runLast
  sha1sum.run.foreach(println)
}
The update() and digest() calls are all contained inside the hash.sha1 Process1.
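For comparison, here is a rough sketch of the update()/digest() loop written directly against the JDK, without scalaz-stream. It is not the library's implementation, only an illustration of the kind of work a digest process has to hide: one fixed-size buffer is reused for every read, so memory use stays constant regardless of file size. The names DigestSketch and digestHex are made up for this example.

```scala
import java.io.InputStream
import java.security.MessageDigest

object DigestSketch {
  // Hypothetical helper, not part of scalaz-stream: hashes everything
  // readable from `in` while reusing a single buffer of `bufSize` bytes.
  def digestHex(in: InputStream, algorithm: String, bufSize: Int = 4096): String = {
    val md  = MessageDigest.getInstance(algorithm)
    val buf = new Array[Byte](bufSize)
    var n   = in.read(buf)
    while (n != -1) {
      // Only the first n bytes of the buffer are valid for this read.
      md.update(buf, 0, n)
      n = in.read(buf)
    }
    // digest() finishes the computation; render the bytes as lowercase hex.
    md.digest().map(b => f"${b & 0xff}%02x").mkString
  }
}
```

You would call it with something like DigestSketch.digestHex(new java.io.FileInputStream(fileName), "SHA-1") and close the stream afterwards. The scalaz-stream version above takes care of opening and closing the file for you.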