Search code examples
vbscript

Issue with System out of Memory in vbscript while reading a big file


I am trying to do some text replacement in vbscript. The problem is that the file is 150MB big. Below is the function which just deletes the header rows and gets rid of blank lines.

As I had suspected that it is giving me an error "out of memory" at line strContents = objTS.ReadAll in cleanHeaderRow sub. At this moment I am not sure if this task can be done in VBScript or not. But before I start exploring other languages any kind of suggestion would be much welcomed.

Sub cleanHeaderRow(browse)
    MsgBox browse
    Const FOR_READING = 1
    Const FOR_WRITING = 2
    'strFileName = "C:\scripts\test.txt"
    strFileName = browse
    iNumberOfLinesToDelete = 1

    Set objFS = CreateObject("Scripting.FileSystemObject")
    Set objTS = objFS.OpenTextFile(strFileName, FOR_READING)
    strContents = objTS.ReadAll
    objTS.Close

    arrLines = Split(strContents, vbNewLine)
    Set objTS = objFS.OpenTextFile(strFileName, FOR_WRITING)

    For i=0 To UBound(arrLines)
        If i > (iNumberOfLinesToDelete - 1) Then
            objTS.WriteLine arrLines(i)
        End If
    Next
End Sub

Sub DeleteBlankRows(browse)
    Const ForReading = 1
    Const ForWriting = 2
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objFile = objFSO.OpenTextFile(browse, ForReading)
    Do Until objFile.AtEndOfStream
        strLine = objFile.Readline
        strLine = Trim(strLine)
        If Len(strLine) > 0 Then
            strNewContents = strNewContents & strLine & vbCrLf
        End If
    Loop
    objFile.Close
    Set objFile = objFSO.OpenTextFile(browse, ForWriting)
    objFile.Write strNewContents
    objFile.Close
End Sub

Solution

  • Both your procedures read the entire file into memory, even though they're doing it in different ways. The correct approach to processing large files is to open the file, process it line by line, writing the output to a temporary file as you read the input, then replace the original file with the temp file.

    Example:

    filename = "..."
    
    Set fso = CreateObject("Scripting.FileSystemObject")
    
    Set inFile  = fso.OpenTextFile(filename)
    Set outFile = fso.OpenTextFile(filename & ".tmp", 2, True)
    Do Until inFile.AtEndOfStream
      If inFile.Line = 1 Then
        inFile.SkipLine                            'skip first line
      Else
        line = inFile.ReadLine
        If line <> "" Then outFile.WriteLine line  'write non-empty lines to output
      End If
    Loop
    inFile.Close
    outFile.Close
    
    fso.DeleteFile filename, True
    fso.MoveFile filename & ".tmp", filename
    

    That way you have only one line in memory at any given time. You trade memory usage for disk usage, though, since you create a second file for the output.