Search code examples
f#fold

Sequence in F# folding triples


I've googled and read, and I'm trying to find a "correct" way to do it, but every question I read on SO seems to have completely different answers.

Here is the gist of my problem. files has the type signature of a seq of a triple (a:string, b:string,c:Int64). Being new to f# I'm still not fluent in expressing type signatures (or for that matter understanding them). a is a filename, b is an internal identifier, and c is a value representing the length (size) of the file. baseconfig is a string from earlier in the code.

ignore(files 
    |> Seq.filter( fun(x,y,z) ->  y = baseconfig)  // used to filter only files we want
    |> Seq.fold( fun f n   -> 
        if( (fun (_,_,z) -> z) n > 50L*1024L*1024L) then
            zipfilex.Add((fun (z:string, _, _) -> z) n)
            printfn("Adding 50mb to zip")
            zipfilex.CommitUpdate()
            zipfilex.BeginUpdate()
            ("","",0L)
        else
            zipfilex.Add((fun (z, _, _) -> z) n)
            ("", "", (fun (_, _, z:Int64) -> z) n + (fun (_, _, z:Int64) -> z) f)
    ) ("","",0L)
    )

What this chunk of code is supposed to do, is iterate through each file in files, add it to a zip archive (but not really, it just goes on a list to be committed later), and when the files exceed 50MB, commit the currently pending files to the zip archive. Adding a file is cheap, committing is expensive, so I try to mitigate the cost by batching it.

So far the code kinda works... Except for the ObjectDisposedException I got when it approached 150MB of committed files. But I'm not sure this is the right way to do such an operation. It feels like I'm using Seq.fold in a unconventional way, but yet, I don't know of a better way to do it.

Bonus question: Is there a better way to snipe values out of tuples? fst and snd only work for 2 valued tuples, and I realize you can define your own functions instead of inline them like I did, but it seems there should be a better way.

Update: My previous attempts at fold, I couldn't understand why I couldn't just use an Int64 as an accumulator. Turns out I was missing some critical parenthesis. Little simpler version below. Also eliminates all the crazy tuple extraction.

ignore(foundoldfiles 
    |> Seq.filter( fun (x,y,z) ->  y = baseconfig) 
    |> Seq.fold( fun (a) (f,g,j)   -> 
        zipfilex.Add( f)
        if( a > 50L*1024L*1024L) then
            printfn("Adding 50mb to zip")
            zipfilex.CommitUpdate()
            zipfilex.BeginUpdate()
            0L
        else
             a + j
    ) 0L
    )

Update 2: I'm going to have to go with an imperative solution, F# is somehow re-entering this block of code, after the zip file is closed in the statement which follows it. Which explains the ObjectDisposedException. No idea how that works or why.


Solution

  • I don't think your problem benefits from the use of fold. It's most useful when building immutable structures. My opinion, in this case, is that it makes what you're trying to do less clear. The imperative solution works nicely:

    let mutable a = 0L
    for (f, g, j) in foundoldfiles do
        if g = baseconfig then
            zipfilex.Add(f)
            if a > 50L * 1024L * 1024L then
                printfn "Adding 50mb to zip"
                zipfilex.CommitUpdate()
                zipfilex.BeginUpdate()
                a <- 0L
            else
                a <- a + j