
VB.NET FileStream: getting spaces after decoding bytes


I'm not sure what is happening. I don't think I changed the code at all, but for some reason I am getting spaces in between the returned characters after using the FileStream object to read the bytes of a file:

        'Turn off Raise Events until after change is checked
        fsw.EnableRaisingEvents = False

        'read from current seek position to end of file
        'VB array bounds are inclusive, so this allocates exactly _maxBytes elements
        Dim bytesRead(_maxBytes - 1) As Byte


        Dim fs As New FileStream(_filename, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)

        If (fs.Length > _maxBytes) Then
            previousSeekPosition = fs.Length - _maxBytes
        End If

        previousSeekPosition = fs.Seek(previousSeekPosition, SeekOrigin.Begin)

        Dim numBytes = fs.Read(bytesRead, 0, _maxBytes)

        fs.Close()

        previousSeekPosition += numBytes

        Dim sb As New StringBuilder()
        For i = 0 To numBytes - 1
            sb.Append(bytesRead(i))
        Next

        'Raise the event to show data
        If Not blnFirstRun Then
            RaiseEvent MoreData(Me, Encoding.ASCII.GetString(bytesRead, 0, numBytes), _filename, _fileDescription)
        Else
            blnFirstRun = False
        End If

        'Check the changes against the alerts
        AlertChange(Encoding.ASCII.GetString(bytesRead, 0, numBytes))

        'Turn Raise Events back on
        fsw.EnableRaisingEvents = True

I have _maxBytes set to 16384. I'm basically reading the file from the last known read position any time there is a file change (similar to what the Linux tail command does).

I tested it on a file and it appeared to work great. For some reason, though, it doesn't work anymore. I don't think I changed anything, but the changes now come back with spaces between the characters.

For example:

I have a file that I have appended '9999' to. When I run the Encoding.ASCII.GetString routine, it shows up as: '9 9 9 9'.

I feel like I'm beating my head against a wall for something probably real simple. Hopefully someone knows the answer quick.


Solution

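  • Getting '9 9 9 9' when "9999" was written to the file suggests that whatever wrote the file used UTF-16 encoding, which uses a minimum of two bytes per character (ref: Wikipedia: Comparison of Unicode encodings). In little-endian UTF-16, each '9' is stored as the two bytes 0x39 0x00. Encoding.ASCII.GetString decodes every byte as its own character, so each 0x00 becomes a NUL character, which is often displayed as a space.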

    Examining the file with a hex editor should reveal if that is in fact the case.

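    If it is, decode the bytes with Encoding.Unicode (UTF-16, little-endian) instead of Encoding.ASCII. Please take note of the remarks in the documentation for the Encoding.Unicode property in case something there could cause a problem.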
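
As a rough illustration of the point above, here is a minimal sketch that reproduces the symptom in memory instead of reading a real file. The module name Utf16TailDemo and the helpers ToHex and LooksLikeUtf16Le are made up for this example; the zero-byte check is only a heuristic for mostly-ASCII text, not a .NET API, and it assumes the writer used little-endian UTF-16.

```vbnet
Imports System
Imports System.Text

Module Utf16TailDemo

    ' Hex dump of the first "count" bytes; a stand-in for the hex-editor check.
    Function ToHex(bytes As Byte(), count As Integer) As String
        Return BitConverter.ToString(bytes, 0, count)
    End Function

    ' Hypothetical heuristic: in mostly-ASCII UTF-16 LE text, the byte at
    ' almost every odd offset is zero. True if more than half of them are.
    Function LooksLikeUtf16Le(bytes As Byte(), count As Integer) As Boolean
        If count < 2 Then Return False
        Dim zeros As Integer = 0
        For i As Integer = 1 To count - 1 Step 2
            If bytes(i) = 0 Then zeros += 1
        Next
        Return zeros * 2 > count \ 2
    End Function

    Sub Main()
        ' Simulate what a UTF-16 writer would put on disk for "9999".
        Dim bytesRead As Byte() = Encoding.Unicode.GetBytes("9999")
        Dim numBytes As Integer = bytesRead.Length

        ' Prints 39-00-39-00-39-00-39-00
        Console.WriteLine(ToHex(bytesRead, numBytes))

        ' ASCII turns every byte into a character, including the 0x00 bytes.
        Dim asAscii As String = Encoding.ASCII.GetString(bytesRead, 0, numBytes)
        Console.WriteLine(asAscii.Length)   ' Prints 8, not 4

        ' Pick the decoder based on the heuristic, then decode only numBytes bytes.
        Dim enc As Encoding = If(LooksLikeUtf16Le(bytesRead, numBytes), Encoding.Unicode, Encoding.ASCII)
        Console.WriteLine(enc.GetString(bytesRead, 0, numBytes))   ' Prints 9999
    End Sub

End Module
```

If the hex dump of the real file shows the same 39-00 pattern, swapping Encoding.ASCII for Encoding.Unicode in the GetString calls should remove the spaces. Two caveats for a tail-style reader: a UTF-16 file may start with a byte order mark (FF-FE for little-endian), and a read that starts at an odd byte offset would split a two-byte character and decode as garbage.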