I'm developing a file-splitter method that splits a file into chunks of a desired size. It works perfectly when the file size divides evenly (e.g., a 2097152-byte file split into two chunks gives 1048576 bytes each).
The problem appears when the division leaves a fraction. For example, splitting an 8194321-byte file into two chunks gives 4097160.5 bytes per chunk. Since I need an integer, I set the chunk size to 4097161 bytes, which should produce a first chunk of 4097161 bytes and a second chunk of 4097160 bytes. But when the method reaches the last chunk, I get a System.ArgumentException
on this instruction:
outputStream.Write(buffer, bufferLength * bufferCount, tmpBufferLength)
with this error message:
Offset and length were out of bounds for the array or count is greater than the number of elements from index to the end of the source collection.
How can I fix my file-splitter method so that it properly splits a file whose size does not divide evenly?
This is a usage example:
Split(sourceFile:=Me.fileToSplit,
      chunkSize:=CInt(New FileInfo(fileToSplit).Length / 2),
      chunkName:="File.Part",
      chunkExt:="fs")
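A note on the chunkSize argument in the call above: in VB.NET, `/` is floating-point division and CInt rounds a midpoint to the nearest even integer, so CInt(8194321 / 2) evaluates to 4097160 rather than 4097161. Below is a minimal sketch of an integer ceiling division that avoids the rounding step. GetChunkSize is a hypothetical helper, not part of the original code, and it assumes the result fits in an Integer.

```vbnet
Imports System

Module ChunkSizeExample

    ' Hypothetical helper (not part of the original code): returns the smallest
    ' chunk size that covers totalSize bytes in at most partCount chunks.
    Function GetChunkSize(ByVal totalSize As Long, ByVal partCount As Integer) As Integer
        If partCount <= 0 Then Throw New ArgumentOutOfRangeException("partCount")
        ' "\" is integer division; adding (partCount - 1) first rounds the result up.
        Return CInt((totalSize + partCount - 1) \ partCount)
    End Function

    Sub Main()
        Console.WriteLine(CInt(8194321 / 2))          ' 4097160 (midpoint rounds to even)
        Console.WriteLine(GetChunkSize(8194321L, 2))  ' 4097161
        Console.WriteLine(GetChunkSize(2097152L, 2))  ' 1048576
    End Sub

End Module
```

With a chunk size of 4097160, the 8194321-byte file would end up in three chunks, the last one holding a single byte, so rounding up is probably what is wanted here.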
This is the relevant code of the file-splitter procedure:
''' <summary>
''' Splits a file into manageable chunks.
''' </summary>
''' <param name="sourceFile">The file to split.</param>
''' <param name="chunkSize">The size per chunk.</param>
''' <param name="chunkName">The name formatting for chunks.</param>
''' <param name="chunkExt">The file-extension for chunks.</param>
Public Sub Split(ByVal sourceFile As String,
                 ByVal chunkSize As Integer,
                 ByVal chunkName As String,
                 ByVal chunkExt As String)

    ' FileInfo instance of the source file.
    Dim fInfo As New FileInfo(sourceFile)
    ' The total filesize to split, in bytes.
    Dim totalSize As Long = fInfo.Length
    ' The remaining size to calculate the percentage, in bytes.
    Dim sizeRemaining As Long = totalSize
    ' Counts the length of the current chunk file to calculate the percentage, in bytes.
    Dim sizeWritten As Long = 0L
    ' The buffer to read data and write the chunks.
    Dim buffer As Byte() = New Byte() {}
    ' The buffer length.
    Dim bufferLength As Integer = 524288 ' 512 Kb
    ' The total amount of chunks to create.
    Dim chunkCount As Long = CLng(Math.Ceiling((fInfo.Length - bufferLength) / (chunkSize)))
    ' Keeps track of the current chunk.
    Dim chunkIndex As Long = 0L
    ' A zero-filled string to enumerate the chunk parts.
    Dim enumeration As String = String.Empty
    ' The chunks filename.
    Dim chunkFilename As String = String.Empty

    ' Open the file to start reading bytes.
    Using inputStream As New FileStream(fInfo.FullName, FileMode.Open)
        Using binaryReader As New BinaryReader(inputStream)
            While (inputStream.Position < inputStream.Length)
                chunkIndex += 1L 'Increment the chunk file counter.
                ' Set chunk filename.
                enumeration = New String("0"c, CStr(chunkCount).Length - CStr(chunkIndex).Length)
                chunkFilename = String.Format("{0}.{1}.{2}", chunkName, enumeration & CStr(chunkIndex), chunkExt)
                ' Reset written byte-length counter.
                sizeWritten = 0L
                ' Create the chunk file to Write the bytes.
                Using outputStream As New FileStream(chunkFilename, FileMode.Create)
                    ' Read until reached the end-bytes of the input file.
                    While (sizeWritten < chunkSize) AndAlso (inputStream.Position < inputStream.Length)
                        ' Read bytes from the source file.
                        buffer = binaryReader.ReadBytes(chunkSize)
                        Dim bufferCount As Integer = 0
                        Dim tmpBufferLength As Integer = bufferLength
                        While (sizeWritten < chunkSize)
                            If (bufferLength + (bufferLength * bufferCount)) >= chunkSize Then
                                tmpBufferLength = chunkSize - ((bufferLength * bufferCount))
                            End If
                            ' Write those bytes in the chunk file.
                            outputStream.Write(buffer, bufferLength * bufferCount, tmpBufferLength)
                            bufferCount += 1
                            ' Increment the bytes-written counter.
                            sizeWritten += tmpBufferLength
                            ' Decrease the bytes-remaining counter.
                            sizeRemaining -= tmpBufferLength
                            ' Reset the temporal buffer length.
                            tmpBufferLength = bufferLength
                        End While
                    End While ' (sizeWritten < chunkSize) AndAlso (inputStream.Position < inputStream.Length)
                    outputStream.Flush()
                End Using ' outputStream
            End While ' inputStream.Position < inputStream.Length
        End Using ' binaryReader
    End Using ' inputStream

End Sub
EDIT: I forgot to mention the reason for the While (sizeWritten < chunkSize)
block: inside it I raise some events, so instead of writing the entire buffer at once I use that loop to write it piece by piece. This way files are split at the exact size, except for files whose size does not divide evenly; those throw the exception I mentioned.
You need to calculate the right amount to read before actually reading. Right now, you always read chunkSize
bytes but you are sometimes discarding the tail of that buffer.
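To see why only the last chunk fails, note that BinaryReader.ReadBytes returns a shorter array when fewer bytes remain in the stream than were requested, while the write loop still counts up to chunkSize. Below is a minimal sketch of that situation, using a 5-byte MemoryStream as a stand-in for the source file (the sizes are made up for illustration).

```vbnet
Imports System
Imports System.IO

Module ShortReadExample

    Sub Main()
        ' 5 bytes of data, read in "chunks" of 3 bytes.
        Using source As New MemoryStream(New Byte() {1, 2, 3, 4, 5})
            Using reader As New BinaryReader(source)
                Dim first As Byte() = reader.ReadBytes(3)
                Dim last As Byte() = reader.ReadBytes(3)
                Console.WriteLine(first.Length) ' 3
                Console.WriteLine(last.Length)  ' 2: only two bytes were left

                Using target As New MemoryStream()
                    Try
                        ' Asking Write for 3 bytes from a 2-byte array is what throws.
                        target.Write(last, 0, 3)
                    Catch ex As ArgumentException
                        Console.WriteLine(ex.GetType().Name)
                    End Try
                End Using
            End Using
        End Using
    End Sub

End Module
```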
I think the intention is that tmpBufferLength
has the correct buffer length. Assuming that works out (which I am too lazy to verify...) read exactly that amount from the source and then write the entire buffer to the destination.
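The suggestion above could look roughly like the following sketch. It is not the original procedure: it swaps BinaryReader.ReadBytes for Stream.Read with a fixed 512 KiB buffer, leaves out the zero-padded numbering and the progress counters, and the names SplitFile, toRead and bytesRead are made up for illustration.

```vbnet
Imports System
Imports System.IO

Module SplitSketch

    ' Hypothetical rewrite of the copy loop only; chunk naming follows the
    ' question's "{name}.{index}.{ext}" pattern without the zero padding.
    Sub SplitFile(ByVal sourceFile As String,
                  ByVal chunkSize As Integer,
                  ByVal chunkName As String,
                  ByVal chunkExt As String)

        If chunkSize <= 0 Then Throw New ArgumentOutOfRangeException("chunkSize")

        Dim buffer(524288 - 1) As Byte ' 512 KiB working buffer
        Dim chunkIndex As Integer = 0

        Using inputStream As New FileStream(sourceFile, FileMode.Open, FileAccess.Read)
            While inputStream.Position < inputStream.Length
                chunkIndex += 1
                Dim chunkFilename As String =
                    String.Format("{0}.{1}.{2}", chunkName, chunkIndex, chunkExt)
                Dim sizeWritten As Long = 0L

                Using outputStream As New FileStream(chunkFilename, FileMode.Create)
                    While sizeWritten < chunkSize
                        ' Decide how much to ask for BEFORE reading: never more than
                        ' the buffer holds or than this chunk still needs.
                        Dim toRead As Integer =
                            CInt(Math.Min(CLng(buffer.Length), chunkSize - sizeWritten))
                        Dim bytesRead As Integer = inputStream.Read(buffer, 0, toRead)
                        If bytesRead = 0 Then Exit While ' end of the source file

                        ' Write only what was actually read.
                        outputStream.Write(buffer, 0, bytesRead)
                        sizeWritten += bytesRead
                        ' Progress events could be raised here, once per buffer.
                    End While
                End Using
            End While
        End Using
    End Sub

End Module
```

Because the amount written always equals the amount read, the last chunk simply ends up shorter when the file size does not divide evenly, and the per-buffer loop from the question's EDIT is preserved.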