I am trying to do some text replacement in vbscript. The problem is that the file is 150MB big. Below is the function which just deletes the header rows and gets rid of blank lines.
As I had suspected that it is giving me an error "out of memory" at line strContents = objTS.ReadAll
in cleanHeaderRow
sub. At this moment I am not sure if this task can be done in VBScript or not. But before I start exploring other languages any kind of suggestion would be much welcomed.
Sub cleanHeaderRow(browse)
MsgBox browse
Const FOR_READING = 1
Const FOR_WRITING = 2
'strFileName = "C:\scripts\test.txt"
strFileName = browse
iNumberOfLinesToDelete = 1
Set objFS = CreateObject("Scripting.FileSystemObject")
Set objTS = objFS.OpenTextFile(strFileName, FOR_READING)
strContents = objTS.ReadAll
objTS.Close
arrLines = Split(strContents, vbNewLine)
Set objTS = objFS.OpenTextFile(strFileName, FOR_WRITING)
For i=0 To UBound(arrLines)
If i > (iNumberOfLinesToDelete - 1) Then
objTS.WriteLine arrLines(i)
End If
Next
End Sub
Sub DeleteBlankRows(browse)
Const ForReading = 1
Const ForWriting = 2
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile(browse, ForReading)
Do Until objFile.AtEndOfStream
strLine = objFile.Readline
strLine = Trim(strLine)
If Len(strLine) > 0 Then
strNewContents = strNewContents & strLine & vbCrLf
End If
Loop
objFile.Close
Set objFile = objFSO.OpenTextFile(browse, ForWriting)
objFile.Write strNewContents
objFile.Close
End Sub
Both your procedures read the entire file into memory, even though they're doing it in different ways. The correct approach to processing large files is to open the file, process it line by line, writing the output to a temporary file as you read the input, then replace the original file with the temp file.
Example:
filename = "..."
Set fso = CreateObject("Scripting.FileSystemObject")
Set inFile = fso.OpenTextFile(filename)
Set outFile = fso.OpenTextFile(filename & ".tmp", 2, True)
Do Until inFile.AtEndOfStream
If inFile.Line = 1 Then
inFile.SkipLine 'skip first line
Else
line = inFile.ReadLine
If line <> "" Then outFile.WriteLine line 'write non-empty lines to output
End If
Loop
inFile.Close
outFile.Close
fso.DeleteFile filename, True
fso.MoveFile filename & ".tmp", filename
That way you have only one line in memory at any given time. You trade memory usage for disk usage, though, since you create a second file for the output.