
Error when counting the total number of lines in a large text file using the Windows command prompt


I would like to count the total number of lines in a text file (> 60 GB) using the Windows command prompt.

I used:

 findstr /R /N "^" file.txt | find /C ":"

But the returned result is a negative number. Is it an overflow? The file has no more than 5 billion lines, while a 4-byte signed integer ranges from −2,147,483,648 to 2,147,483,647. So, do I need to design a script that counts by dividing the result by 1000?

If yes, please help me with how to design the Windows batch file.


Solution

  • You could try a JScript solution. JScript numbers are always 64-bit floats, which represent integers exactly up to about 15 digits. It will take a while, though: this script takes about 15 seconds to count the lines in a 100 MB XML file on my machine.

    Edit: Since the float data type wasn't large enough, I modified the script to use an array of decimal digits as a counter and then output the result as a joined string. As long as fso.OpenTextFile().SkipLine() doesn't choke (in which case the only remedy is to try a different language, maybe Python or Perl), this should work, and hopefully the performance hit won't be too large. I tested it on a 4.3 GB ISO file and it took about 8 minutes.

    @if (@a==@b) @end /*
    
    :: countlines.bat
    :: usage: countlines.bat filetocount.log
    
    :: batch portion does nothing remarkable
    :: but relaunches itself with jscript interpreter
    @echo off
    
    cscript /nologo /e:jscript "%~f0" "%~f1"
    
    goto :EOF
    
    :: end of batch / begin JScript */
    
    var fso, f, file = WSH.Arguments(0), longVal = [0],
    ForReading = 1, ForWriting = 2, b4 = new Date();
    
    // inc() adds 1 to the global digit array longVal[] (most significant digit first),
    // carrying from right to left and prepending a digit when the leftmost one overflows
    function inc() {
        for (var i=longVal.length - 1; i>=0; i--) {
            if (++longVal[i] == 10) {
                longVal[i] = 0;
                if (!i) {
                    longVal.splice(0, 0, 0);
                    i++;
                }
                continue;
            }
            else break;
        }
    }
    
    fso = new ActiveXObject("Scripting.FileSystemObject");
    f = fso.OpenTextFile(file, ForReading);
    while (!f.AtEndOfStream) {
        f.SkipLine();
        inc();
    }
    WSH.Echo(longVal.join(''));
    f.Close();
    
    var stopwatch = 'Line count completed in ' + ((new Date() - b4) / 1000.0) + 's';
    WSH.StdErr.WriteLine(stopwatch);
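
To make the overflow question and the digit-array counter easier to check without a 60 GB file, here is a minimal sketch in plain JavaScript (not tied to Windows Script Host). It assumes the negative result comes from a signed 32-bit counter wrapping around, which the question suspects but which is not confirmed here. The names asInt32, recoverCount and makeCounter are hypothetical helpers, not part of the script above.

```javascript
// ASSUMPTION: the tool's counter is a signed 32-bit integer that wraps modulo 2^32.
var TWO_POW_32 = 4294967296;

// What a signed 32-bit counter would report for a given true count.
// `| 0` applies JavaScript's ToInt32 conversion, which wraps modulo 2^32.
function asInt32(trueCount) {
    return trueCount | 0;
}

// Undo a single wrap. Only valid if the true count is known to be below 2^32.
function recoverCount(reported) {
    return reported < 0 ? reported + TWO_POW_32 : reported;
}

// Decimal digit-array counter in the spirit of inc() above:
// digits are stored most significant first and carried right to left.
function makeCounter() {
    var digits = [0];
    return {
        inc: function () {
            for (var i = digits.length - 1; i >= 0; i--) {
                if (++digits[i] < 10) {
                    return;          // no carry needed, done
                }
                digits[i] = 0;       // digit rolled over, carry into the next one
            }
            digits.unshift(1);       // carried out of the leftmost digit
        },
        toString: function () {
            return digits.join('');
        }
    };
}
```

Under that assumption, a true count of 3,000,000,000 would be reported as asInt32(3000000000), which is -1294967296. Because the file is said to have no more than 5 billion lines, adding 2^32 to a negative result (recoverCount) gives the only candidate that fits. A positive result would stay ambiguous, since a count just above 2^32 also wraps to a small positive number.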