
F# Writing to file changes behavior on return type


I have the following function that converts CSV files to a specific txt schema (the one expected by the CNTKTextFormat Reader):

open System.IO
open FSharp.Data;
open Deedle;

let convert (inFileName : string) = 
    let data = Frame.ReadCsv(inFileName)
    let outFileName = inFileName.Substring(0, (inFileName.Length - 4)) + ".txt"
    use outFile = new StreamWriter(outFileName, false)
    data.Rows.Observations
    |> Seq.map(fun kvp ->
        let row = kvp.Value |> Series.observations |> Seq.map(fun (k,v) -> v) |> Seq.toList
        match row with
        | label::data ->
            let body = data |> List.map string |> String.concat " "
            outFile.WriteLine(sprintf "|labels %A |features %s" label body)
            printf "%A" label
        | _ ->
            failwith "Bad data."
    )
    |> ignore

Strangely, after running this in F# Interactive, the output file is empty and the printf prints nothing at all.

If I remove the ignore to check that rows are actually being processed (which should show up as a returned seq of nulls), then instead of an empty file I get:

val it : seq<unit> = Error: Cannot write to a closed TextWriter.

Previously I declared the StreamWriter with let and disposed of it manually, but that also produced empty files or files with only a few lines (say 5 out of thousands).

What is happening here? And how do I fix the file writing?


Solution

  • Seq.map returns a lazy sequence, which is not evaluated until something iterates over it. Nothing iterates over it inside convert, so no rows are processed: the file stays empty and printf never runs. If you instead return the seq<unit> and iterate over it outside convert, the use binding has already disposed outFile by the time convert returns, which is why you see the "closed TextWriter" exception.

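    You should use Seq.iter instead, which runs the function on every row immediately and returns unit, so the trailing ignore is no longer needed: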

    data.Rows.Observations
        |> Seq.iter (fun kvp -> ...)
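
The difference can be reproduced without Deedle. The following is a minimal sketch, not the original convert function: writeLazy and writeEager are hypothetical helpers, the rows are made-up integer lists, and the output goes to temporary files. It only assumes System.IO and the F# core Seq module.

```fsharp
open System.IO

// Hypothetical helper mirroring the question: Seq.map builds a lazy sequence,
// and ignore discards it without ever enumerating it, so WriteLine never runs.
let writeLazy (path : string) (rows : seq<int list>) =
    use outFile = new StreamWriter(path, false)
    rows
    |> Seq.map (fun row -> outFile.WriteLine(row |> List.map string |> String.concat " "))
    |> ignore

// Hypothetical helper mirroring the fix: Seq.iter enumerates the sequence
// immediately, while outFile is still open.
let writeEager (path : string) (rows : seq<int list>) =
    use outFile = new StreamWriter(path, false)
    rows
    |> Seq.iter (fun row -> outFile.WriteLine(row |> List.map string |> String.concat " "))
    // outFile is flushed and disposed here, after every row has been written

// Made-up sample data standing in for the CSV rows.
let rows = seq { yield [1; 2; 3]; yield [4; 5; 6] }

let lazyPath = Path.GetTempFileName()
let eagerPath = Path.GetTempFileName()

writeLazy lazyPath rows
writeEager eagerPath rows

let lazyLines = File.ReadAllLines lazyPath
let eagerLines = File.ReadAllLines eagerPath

// prints "lazy: 0 line(s), eager: 2 line(s)"
printfn "lazy: %d line(s), eager: %d line(s)" lazyLines.Length eagerLines.Length
```

If the mapped values are needed after writing, forcing the sequence inside the function (for example with Seq.toList) before the writer goes out of scope should have the same effect as Seq.iter here.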