
Akka stream from ftp, line by line


I'm trying to read a file from an FTP server using Alpakka and Akka Streams in Scala. The type I get from Ftp.fromPath(...) is Source[ByteString, Future[IOResult]]. I want to read the file line by line (it's a CSV file), but I don't know how.

I will be grateful for any help.


Solution

  • There is a standard way to split a Source[ByteString, _] by lines, called Framing.delimiter. It can be used like this:

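    import akka.stream.IOResult
    import akka.stream.alpakka.ftp.scaladsl.Ftp
    import akka.stream.scaladsl.{Framing, Source}
    import akka.util.ByteString
    import scala.concurrent.Future

    val source: Source[ByteString, Future[IOResult]] = Ftp.fromPath(...)

    // Emits one ByteString per line, with the "\n" delimiter removed
    val splitter = Framing.delimiter(
      ByteString("\n"),
      maximumFrameLength = 1024,
      allowTruncation = true
    )

    val result: Source[ByteString, Future[IOResult]] = source.via(splitter)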
    

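    The maximumFrameLength parameter sets the maximum length of a line in bytes; a line longer than that fails the stream. You can set it to Int.MaxValue for an essentially unlimited line length, although that can be dangerous if your CSV lines are very long, because each line is buffered in memory until its delimiter arrives. allowTruncation is set to true so that the last line is still emitted when there is no newline at the end of your CSV file; with false, the stream would fail in that case.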

    The result source, when materialized, will emit one ByteString per line, without the newline character. If you expect your files to contain Windows line separators ("\r\n"), each line will still end with "\r", so you'll need to trim these strings manually.
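
As a rough illustration of the trimming step, here is a minimal sketch that decodes each frame to a String and strips a trailing "\r". It assumes Akka 2.6 or later (where an implicit ActorSystem is enough to run a stream) and UTF-8 content. The object name LineSplitSketch, the readLines helper, and the in-memory chunks are hypothetical stand-ins for the real Ftp.fromPath(...) source, used only so the example runs without an FTP server.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Framing, Sink, Source}
import akka.util.ByteString

import scala.concurrent.Await
import scala.concurrent.duration._

object LineSplitSketch {
  // Hypothetical stand-in for the FTP source: the chunks cut a line in the
  // middle, use Windows line endings, and the last line has no newline.
  val chunks: List[ByteString] =
    List(ByteString("id,name\r\n1,Al"), ByteString("ice\r\n2,Bob"))

  def readLines(): Seq[String] = {
    implicit val system: ActorSystem = ActorSystem("line-split-sketch")
    try {
      val lines = Source(chunks)
        // Re-frame the arbitrary chunks into one ByteString per line
        .via(Framing.delimiter(ByteString("\n"), maximumFrameLength = 1024, allowTruncation = true))
        // Decode as UTF-8 (an assumption) and drop the "\r" left by "\r\n" endings
        .map(_.utf8String.stripSuffix("\r"))
        .runWith(Sink.seq)
      // Blocking is acceptable in a small demo; real code would compose the Future
      Await.result(lines, 5.seconds)
    } finally {
      system.terminate()
    }
  }
}
```

With a real FTP source, the same .via(...).map(...) steps could be applied to the Source returned by Ftp.fromPath(...) in place of Source(chunks).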