Search code examples
excellarge-fileslarge-data-volumesvba

VBA for excel; editing very large files


I have a very large set of log files with AIS(shipping) data. Since these log files are about 200Mb per day, I'm trying to size them down for archiving. The files look like this:

244630075;under way ;128°'; 0.0kt;52.395290N;4.886883E;342.0°;511°;55s; 170418 000000;serial#1(A)[1]
244670835;under way ;128°'; 0.0kt;52.410140N;4.833700E;283.8°;511°;54s; 170418 000000;serial#1(B)[3]
244750830;under way ;128°'; 0.0kt;52.404563N;4.864063E;  0.0°;511°;55s; 170418 000000;serial#1(B)[1]
244900124;under way ;000°'; 7.1kt;52.426495N;4.780100E;279.4°;281°;56s; 170418 000000;serial#1(B)[2]
244670779;under way ;000°'; 0.0kt;52.420773N;4.801418E;330.9°;325°;58s; 170418 000000;serial#1(A)[1]
244660512;under way ;128°'; 0.0kt;52.402092N;4.781258E;268.3°;511°;54s; 170418 000000;serial#1(B)[1]
236202000;under way ;000°';11.7kt;52.477408N;4.462048E;285.4°;296°;55s; 170418 000000;serial#1(B)[1]
244690403;under way ;128°'; 0.0kt;52.400760N;4.891647E;  0.0°;511°;55s; 170418 000000;serial#1(A)[1]

This goes on for about 2 million lines per file. In order to size these files down, I want to remove every line containing " 0.0kt", since that represents information which is not usefull for me. In order to do so, I wrote a VBA script in Excel. I seem to have the script working for the major part. It runs through the file and edits out all lines containing the " 0.0kt". But when the script ends, and should save it exports an empty file.

This is my script:

Sub test()
'this will force the script to end when end of file is reached
On Error GoTo ASD

Const ForReading = 1
Const ForWriting = 2

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile("C:\x\170418.log", ForReading)

x = 1

Do
Do While i < 1000

        strline = objFile.ReadLine
         If InStr(strline, " 0.0kt") = 28 Then
            strline = "" & vbCrLf

        End If
    i = i + 1

Loop

'doevents and a calculation to call doevents after 1000 lines to prevent freezing of the script
DoEvents
a = a + 1
b = a * 1000
i = i + b
x = i / 1000
i = 0
iLineNumber = x

Loop

ASD:

objFile.Close

Set objFile = objFSO.OpenTextFile("C:\x\170418.log", ForWriting)
objFile.Write strline

objFile.Close

End Sub

What am I missing to save and close the file with all the lines containing " 0.0kt" removed, instead of all lines removed?

Thanks


Solution

  • Looking at your sample text, I think any line that contains ; 0.0kt; can be excluded.

    Using something I've already built, I've tweaked it to pick up your your file and use your DoEvents every 1000 rows.

    Sub Test()
    
        Dim ifileno As Integer, ofileno As Integer, rownum As Long
        Dim ifilename As String, ofilename As String, excludestring As String, strLine As String
    
        ifilename = "C:\Users\v.doynov\Desktop\nd.txt"
        ofilename = "C:\Users\v.doynov\Desktop\nd_output.txt"
        excludestring = "; 0.0kt;"
    
        ifileno = FreeFile
        Open ifilename For Input As ifileno
    
        ofileno = FreeFile
        Open ofilename For Output As ofileno
    
        rownum = 0
    
        Do Until EOF(ifileno)
            rownum = rownum + 1
            Line Input #ifileno, strLine
            If InStr(strLine, excludestring) = 0 Then Print #ofileno, strLine
            If rownum Mod 1000 = 0 Then DoEvents
        Loop
    
        Close ifileno
        Close ofileno
    
    End Sub