I would like to find the total number of lines in a text file (> 60 GB) using the Windows command prompt.
I used:
findstr /R /N "^" file.txt | find /C ":"
But the returned result is a negative number. Is it an overflow? The file has no more than 5 billion lines. A 4-byte signed integer ranges from −2,147,483,648 to 2,147,483,647. So, do I need to design a script that counts the lines by dividing the result by 1000?
If so, please help me design the Windows batch file.
You could try a JScript solution. JavaScript numbers are always 64-bit floats, which represent integers exactly up to 2^53 (roughly 15 digits). It'll take a while though. It takes me about 15 seconds to count the lines in a 100 MB XML file with this script.
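As a rough illustration of both points (not a claim about how find is implemented internally), the sketch below coerces a made-up count of 3,000,000,000 to a signed 32-bit integer, which wraps to a negative value, and then checks that a JavaScript number still holds that count exactly. The sample count and the names asInt32, maxExactInteger, wrapped and stillExact are hypothetical and exist only for this example.

```javascript
// Hypothetical line count, chosen to lie between 2^31 and 2^32.
var sampleCount = 3000000000;

// "| 0" coerces a number to a signed 32-bit integer,
// which mimics what a 4-byte signed counter would hold.
function asInt32(n) {
    return n | 0;
}

// Every integer up to 2^53 is exactly representable in a 64-bit float.
var maxExactInteger = Math.pow(2, 53);

// Wraps around to a negative value: 3000000000 - 2^32.
var wrapped = asInt32(sampleCount);

// At this magnitude, adding 1 is still exact in a 64-bit float.
var stillExact = (sampleCount + 1) - sampleCount === 1;
```

A negative result from the original command is at least consistent with this kind of wraparound, whereas a count of a few billion is far below 2^53.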
Edit: Since the float data type wasn't large enough, I modified the script to use an array as a counter, then output the result as a joined string. As long as fso.OpenTextFile().SkipLine() doesn't choke (for which there is no solution but to try a different language, maybe Python or Perl?), this should work, and hopefully it won't be too expensive a hit on performance. I tested it on a 4.3 GB ISO file and it took about 8 minutes.
@if (@a==@b) @end /*
:: countlines.bat
:: usage: countlines.bat filetocount.log
:: batch portion does nothing remarkable
:: but relaunches itself with jscript interpreter
@echo off
cscript /nologo /e:jscript "%~f0" "%~f1"
goto :EOF
:: end of batch / begin JScript */
var fso, f, file = WSH.Arguments(0), longVal = [0],
    ForReading = 1, ForWriting = 2, b4 = new Date();

// inherits global array longVal[]
// increments each element from right to left
function inc() {
    for (var i = longVal.length - 1; i >= 0; i--) {
        if (++longVal[i] == 10) {
            // digit overflowed: reset it and carry into the digit to the left
            longVal[i] = 0;
            if (!i) {
                // carried out of the leftmost digit: prepend a new 0 digit
                // and step back onto it so the next pass increments it
                longVal.splice(0, 0, 0);
                i++;
            }
            continue;
        }
        else break;
    }
}

fso = new ActiveXObject("Scripting.FileSystemObject");
f = fso.OpenTextFile(file, ForReading);

// skip one line at a time, bumping the digit-array counter for each
while (!f.AtEndOfStream) {
    f.SkipLine();
    inc();
}

// line count goes to stdout, timing goes to stderr
WSH.Echo(longVal.join(''));
f.Close();
var stopwatch = 'Line count completed in ' + ((new Date() - b4) / 1000.0) + 's';
WSH.StdErr.WriteLine(stopwatch);
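To check the carry logic without reading a huge file, here is a minimal stand-alone sketch of the same digit-array idea. It is a slightly restructured variant, not the exact inc() above: it takes the array as a parameter instead of using the global longVal. The names incDigits and countTo are hypothetical helpers introduced only for this illustration.

```javascript
// Increment a decimal counter stored as an array of digits,
// most significant digit first. Modifies and returns the same array.
function incDigits(digits) {
    for (var i = digits.length - 1; i >= 0; i--) {
        if (++digits[i] < 10) {
            return digits;      // no carry needed, done
        }
        digits[i] = 0;          // this digit overflowed, carry to the left
    }
    digits.unshift(1);          // carried out of the leftmost digit, e.g. 99 -> 100
    return digits;
}

// Count from zero up to n by repeated increments and
// return the counter as a string, the way the script prints it.
function countTo(n) {
    var digits = [0];
    for (var k = 0; k < n; k++) {
        incDigits(digits);
    }
    return digits.join('');
}
```

Because every element only ever holds a value from 0 to 9, the counter is limited by the array's length rather than by the range of any single number type.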