Tags: vb.net, for-loop, filestream

Parse a HEX file in parallel with FileStream


I'm trying to speed up a For i As Integer loop to get maximum performance. I've been using more and more Parallel and Async methods in my code, and they help a lot. However, I'm currently stuck on this one. I simply want to loop through a file, reading specific index positions to see what's in the file, so I can do certain things with it later on.

This is an example of the current For loop:

Using fs As New FileStream(PathToFile, FileMode.Open, FileAccess.Read, FileShare.Read)
    For i As Long = 0 To fs.Length Step 1024

        'Go to the calculated index in the file
        fs.Seek(i, SeekOrigin.Begin)

        'Now get 24 bytes at the current index
        fs.Read(buffer, 0, 24)

        'Do some stuff with it
        List.Add(buffer)
    Next
End Using

The files can be 15 MB to 4 GB in size. I'm currently stuck on how to use Step 1024 in a Parallel.For, and on how to make the approach thread-safe. Hopefully someone can help me out with this.


Solution

  • This might be marked down, but if memory usage isn't a huge problem, I would suggest reading through the whole file once and storing just those 24-byte sequences in an array of Byte(23) arrays, then processing that array with Parallel.For. On my PC the array takes up about 160 MB for a 4 GB file. The read speed will of course depend on the system used; on my PC it takes around 25 seconds.

    Try this:

    Imports System.IO
    Imports System.Threading.Tasks
    
    Public Class Form1
        'Create the data array with just 1 element to start with;
        'it is resized once you know how many 1024-byte
        'chunks you have in your file
        Dim DataArray(0)() As Byte
    
        Private Sub ReadByteSequences(pathtoFile As String)
            Using fs As New FileStream(pathtoFile, FileMode.Open, FileAccess.Read, FileShare.Read)
                'Number of 1024-byte chunks, rounded up so a trailing
                'partial chunk still gets a slot (\ is integer division)
                Dim chunkCount As Long = (fs.Length + 1023) \ 1024
                'Resize the array now that the file size is known
                ReDim DataArray(CInt(chunkCount) - 1)
                'Stop at Length - 1 so no read starts exactly at the end of the file
                For i As Long = 0 To fs.Length - 1 Step 1024
                    'A new buffer per chunk, so the entries do not share one array
                    Dim buffer(23) As Byte
                    fs.Seek(i, SeekOrigin.Begin)
                    'Read can return fewer than 24 bytes near the end of
                    'the file; the rest of the buffer then stays zero
                    fs.Read(buffer, 0, 24)
                    'Store each 24-byte sequence in order in the
                    'DataArray for later processing
                    DataArray(CInt(i \ 1024)) = buffer
                Next
            End Using
        End Sub
    
        Private Sub ProcessDataArray()
            Parallel.For(0, DataArray.Length, Sub(i As Integer)
                                                  'Do stuff in parallel with DataArray(i)
                                              End Sub)
        End Sub
    
    End Class
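
On the original question of how to get a Step 1024 inside Parallel.For: one option is to loop over chunk indexes instead of byte offsets and multiply by 1024 inside the loop body. The sketch below is only an illustration of that idea, not a measured improvement. The names ParallelChunkReader, ReadRecordsParallel, ChunkStride and RecordSize are made up for this example. It assumes the chunk count fits in an Integer (true for a 4 GB file) and that opening the file several times with FileShare.Read is acceptable. Each worker gets its own FileStream through the thread-local overload of Parallel.For, so no stream position is shared between threads, and each iteration writes to its own array slot, so no lock is needed.

```vbnet
Imports System
Imports System.IO
Imports System.Threading.Tasks

Module ParallelChunkReader
    'Values taken from the question: 24 bytes at every 1024-byte boundary
    Private Const ChunkStride As Integer = 1024
    Private Const RecordSize As Integer = 24

    'Reads the first RecordSize bytes of every ChunkStride-byte block
    Public Function ReadRecordsParallel(pathToFile As String) As Byte()()
        Dim fileLength As Long = New FileInfo(pathToFile).Length
        'Ceiling division: a trailing partial block still gets a slot
        Dim chunkCount As Long = (fileLength + ChunkStride - 1) \ ChunkStride
        Dim records(CInt(chunkCount) - 1)() As Byte

        Parallel.For(Of FileStream)(
            0L, chunkCount,
            Function() New FileStream(pathToFile, FileMode.Open, FileAccess.Read, FileShare.Read),
            Function(index As Long, state As ParallelLoopState, fs As FileStream)
                Dim buffer(RecordSize - 1) As Byte
                'The "Step 1024" becomes a multiplication of the chunk index
                fs.Seek(index * ChunkStride, SeekOrigin.Begin)
                'Read may return fewer bytes than requested, so loop until full or EOF
                Dim total As Integer = 0
                While total < RecordSize
                    Dim n As Integer = fs.Read(buffer, total, RecordSize - total)
                    If n = 0 Then Exit While
                    total += n
                End While
                'Each iteration writes to a different slot, so no lock is needed
                records(CInt(index)) = buffer
                'Hand the stream back for reuse by the same worker
                Return fs
            End Function,
            Sub(fs As FileStream) fs.Dispose())

        Return records
    End Function
End Module
```

Calling ParallelChunkReader.ReadRecordsParallel(pathToFile) returns one 24-byte array per 1024-byte block, in file order. Whether this beats the single sequential pass above depends heavily on the storage: on a spinning disk, seeks from several threads may well be slower, so it is worth timing both on the real files.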