scala iterator list-comprehension for-comprehension

Need the best way to iterate a file returning batches of lines as XML


I'm looking for the best way to process a file in which, based on the contents, I combine certain lines into XML and return the XML.

For example, given

line 1
line 2
line 3
line 4
line 5

I may want the first call to return

<msg>line 1, line 2</msg>

and a subsequent call to return

<msg>line 5, line 4</msg>

skipping line 3 for uninteresting content and exhausting the input stream. (Note: the <msg> tags will always contain contiguous lines, but the number and organization of those lines in the XML will vary.) If you'd like some criteria for choosing lines to include in a message, assume that odd-numbered lines combine with the following four lines, even-numbered lines combine with the following two lines, lines whose number is a multiple of 10 combine with the following five lines, and lines that start with '#' are skipped.

I was thinking I should implement this as an iterator so I can just do

<root>{ for (m <- messages(inputstream)) yield m }</root>

Is that reasonable? If so, how best to implement it? If not, how best to implement it? :)

Thanks


Solution

  • This answer provided my solution: How do you return an Iterator in Scala?

    I tried the following, but there appears to be some sort of buffer issue: lines are skipped between calls to Log.next.

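    import scala.xml.Node

    class Log(filename: String) {

      val src = io.Source.fromFile(filename)
      var node: Node = null

      def iterator = new Iterator[Node] {
        def hasNext: Boolean = {
          // Note: getLines() is invoked again on every hasNext call
          for (line <- src.getLines()) {
            // ... do stuff ...
            if (null != node) return true
          }
          // Input exhausted: release the file handle
          src.close()
          false
        }

        def next(): Node = node
      }
    }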
    

    There might be a more idiomatic Scala way to do it, and I'd like to see it, but this is my solution to move forward for now.
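One possible reading of the skipped lines (an assumption, not something the post confirms) is that the attempt above calls src.getLines() inside hasNext, so every call starts a fresh line iterator over the same source. A minimal sketch of an alternative is below: it takes the line iterator once, keeps the node it has built until next() hands it out, and makes hasNext safe to call repeatedly. The names Messages and batchSize are hypothetical, the batching rule in the demo is made up for illustration, and the scala-xml module is assumed to be on the classpath, as the Node type in the original code implies.

```scala
import scala.collection.mutable.ListBuffer
import scala.xml.Node

// Hypothetical helper, not from the original post.
// batchSize(firstLine) returns how many lines form the batch that starts
// at firstLine (counting firstLine itself), or 0 to skip that line.
class Messages(lines: Iterator[String], batchSize: String => Int)
    extends Iterator[Node] {

  // Next node to hand out; filled by advance(), cleared by next().
  private var pending: Option[Node] = None

  // Pull lines from the single shared iterator until a batch is built
  // or the input runs out. Does nothing if a node is already pending.
  private def advance(): Unit =
    while (pending.isEmpty && lines.hasNext) {
      val first = lines.next()
      val n = batchSize(first)
      if (n > 0) {
        val batch = ListBuffer(first)
        while (batch.size < n && lines.hasNext) batch += lines.next()
        pending = Some(<msg>{batch.mkString(", ")}</msg>)
      }
    }

  def hasNext: Boolean = { advance(); pending.isDefined }

  def next(): Node = {
    advance()
    val node = pending.getOrElse(throw new NoSuchElementException("no more messages"))
    pending = None
    node
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    // Made-up rule: skip lines starting with '#', otherwise pair up lines.
    val rule: String => Int = line => if (line.startsWith("#")) 0 else 2
    // In-memory lines stand in for a file; with a real file, call
    // getLines() once on the Source and close the Source when done.
    val lines = Iterator("line 1", "line 2", "# skip", "line 4", "line 5")
    val root = <root>{ new Messages(lines, rule).toList }</root>
    println(root) // one <msg> element per batch
  }
}
```

Because Messages is an ordinary Iterator[Node], it can be dropped into the for-comprehension from the question, e.g. `<root>{ for (m <- new Messages(lines, rule)) yield m }</root>`. Keeping the batching rule as a function argument is only one design choice; the variable batch sizes described in the question would go into that function.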