.net, vb.net, visual-studio, streamreader

Reading line by line through text file taking abnormally long time


I need to read through a text file line by line and append each line to one of two strings, depending on whether certain criteria are met. This is taking a really long time, and I'm wondering whether there is a quicker way of doing it. I have done a lot of research on how to do this, and this is the best I could come up with. (I append to two strings because both have to be written out to text files straight afterwards.)

The contents are in one huge text file in which each piece of information begins at a line starting with "aaa". I have to go through the file separating these pieces of information by looking for lines that begin with "aaa". Whether a piece of information goes into fullStr1 or fullStr2 depends on the character at index 29 of its "aaa" line: anything other than a blank space (" ") goes to fullStr1, and a blank space goes to fullStr2. Thanks.

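        ' Requires Imports System.IO; file, fullStr1 and fullStr2 are declared elsewhere.
        Using reader As StreamReader = New StreamReader(file)
            Dim line As String = reader.ReadLine
            Do While (Not line Is Nothing)
                If line.Substring(0, 3) = "aaa" AndAlso line.Substring(29, 1) <> " " Then
                    ' Character at index 29 is not a space: collect the record in fullStr1.
                    Do
                        fullStr1 = fullStr1 & line & vbCrLf
                        line = reader.ReadLine
                    Loop While (Not line Is Nothing AndAlso line.Substring(0, 3) <> "aaa")
                ElseIf line.Substring(0, 3) = "aaa" AndAlso line.Substring(29, 1) = " " Then
                    ' Character at index 29 is a space: collect the record in fullStr2.
                    Do
                        fullStr2 = fullStr2 & line & vbCrLf
                        line = reader.ReadLine
                    Loop While (Not line Is Nothing AndAlso line.Substring(0, 3) <> "aaa")
                Else
                    ' Line before the first "aaa" record: skip it so the loop always advances.
                    line = reader.ReadLine
                End If
            Loop
        End Using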

Solution

  • A very quick and easy solution is to use the StringBuilder type instead of the String type for the accumulator variables fullStr1 and fullStr2 (see https://msdn.microsoft.com/en-us/library/ms172824.aspx). The line variable can stay a String, since ReadLine returns one. Strings are immutable, which means that every time you append to fullStr1 or fullStr2 you are not updating the value in place. Instead, a new string is allocated, the entire existing content plus the new line is copied into it, and the old string is left for the garbage collector. As the strings grow, each append copies more data, so the total work grows roughly with the square of the file size. StringBuilder appends into a resizable buffer, which avoids that repeated copying.
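
As a rough illustration of the StringBuilder approach, here is a minimal sketch rather than a drop-in replacement. The module name, the SplitRecords helper, the output file names, and the handling of header lines shorter than 30 characters are all assumptions that do not come from the question. It keeps the vbCrLf line endings used in the original code.

```vbnet
Imports System.IO
Imports System.Text

Module SplitRecordsSketch
    ' Hypothetical helper: reads the file at inputPath and appends each "aaa" record
    ' to one of two builders, chosen by the character at index 29 of its header line.
    Sub SplitRecords(inputPath As String, nonBlank As StringBuilder, blank As StringBuilder)
        Dim target As StringBuilder = Nothing ' builder for the record currently being read

        For Each line As String In File.ReadLines(inputPath)
            If line.StartsWith("aaa", StringComparison.Ordinal) Then
                ' Assumption: a header shorter than 30 characters is treated as blank at index 29.
                If line.Length > 29 AndAlso line(29) <> " "c Then
                    target = nonBlank
                Else
                    target = blank
                End If
            End If

            ' Lines before the first "aaa" header belong to no record and are skipped.
            If target IsNot Nothing Then
                target.Append(line).Append(vbCrLf)
            End If
        Next
    End Sub

    Sub Main(args As String())
        Dim fullStr1 As New StringBuilder()
        Dim fullStr2 As New StringBuilder()

        ' args(0) is assumed to be the path of the input file.
        SplitRecords(args(0), fullStr1, fullStr2)

        ' Hypothetical output file names; ToString is called once per builder, at the end.
        File.WriteAllText("output1.txt", fullStr1.ToString())
        File.WriteAllText("output2.txt", fullStr2.ToString())
    End Sub
End Module
```

Tracking the current target builder replaces the two nested Do loops, so every line is read and appended exactly once. Unlike Substring(0, 3), StartsWith does not throw on lines shorter than three characters.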