Does Node's child_process have an output limit?
And how can I fetch, chunk by chunk, a HUGE command output, such as git show on a commit that contains a huge file?
I am trying to parse the output of git show <commit sha>, and the commit contains a HUGE file (308 344 lines).
When I run git show > showed_by_git.txt, I get the right output, with all the files, for a total of 118 667 lines.
When I run the same command through Node's child_process, I only retrieve 32 602 lines...
I simplified my code and used child_process only to count the number of chars and the number of lines.
The stream closes before the whole output has been delivered: the result shows that it stops at 32 000+ lines instead of the expected 118 667.
You can reproduce this at home if you have a repository with some HUUUGE file that has been committed recently:
const childProcess = require('child_process')

function fetchCommand (command) {
  return new Promise((resolve, reject) => {
    const sub = childProcess.exec(command)
    let chars = 0
    let lines = 0
    sub.stdout.on('data', function (chunk) {
      chars += chunk.length
      lines += chunk.split('\n').length
      console.log('chars:' + chars + ' lines:' + lines)
      // logs the char and line count on each chunk of data,
      // then 'forgets' the data : no memory overloading
    })
    sub.stdout.on('close', function () {
      console.log('CLOSED')
    })
    sub.stderr.on('error', function (err) {
      console.log('ERROR: ' + err.message)
    })
  })
}

fetchCommand('git show').catch(err => console.log(err))
Here is the output:
C:\Users\guill\.code\git2stats>node examples/fetchTest.js
chars:4096 lines:126
chars:73728 lines:2117
chars:131072 lines:3772
chars:176128 lines:5176
chars:229376 lines:6560
chars:262144 lines:7663
chars:323584 lines:9171
chars:393216 lines:11304
chars:462848 lines:13483
chars:475136 lines:13849
chars:507904 lines:14916
chars:536576 lines:15839
chars:573440 lines:17028
chars:618496 lines:18484
chars:688128 lines:20539
chars:737280 lines:22000
chars:765952 lines:22930
chars:794624 lines:23860
chars:823296 lines:24794
chars:892928 lines:26976
chars:962560 lines:29104
chars:991232 lines:30003
chars:1032192 lines:31292
chars:1073152 lines:32602
CLOSED
You can see that it stops at 32 602 lines, whereas this particular git show has 118 667 lines to show.
I checked the last chunk of data to see if anything special happened around the big file, and I can confirm that it stops right in the middle of the file.
I am writing a git statistics tool, and the program is well on its way, since I can already parse git log --stat, then git show <commit sha> for each commit, and return a satisfying JSON.
This is some kind of Node foot-gun that I was previously unaware of.
Yes, there is a limit. It is configured with the maxBuffer
option (see docs), which can be set to Infinity if you like. The idea that this can be set to Infinity is not documented.
const sub = childProcess.exec(command, { maxBuffer: Infinity });
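To see the limit in action without needing a big Git commit, here is a minimal sketch. The helper name runWithLimit and the demoCommand string are made up for illustration; the command simply asks Node itself to print 5000 characters. As far as I know, the default maxBuffer in current Node releases is 1024 * 1024 bytes, which would match the output above stopping just past 1 MiB, but check the docs for your version.

```javascript
const { exec } = require('child_process')

// Hypothetical helper (not part of Node): runs a command with a given
// maxBuffer and reports whether exec gave up on it.
function runWithLimit (command, maxBuffer) {
  return new Promise(resolve => {
    exec(command, { maxBuffer }, (err, stdout) => {
      resolve({
        failed: Boolean(err),
        length: stdout.length,
        message: err ? err.message : null
      })
    })
  })
}

// Assumption: use the running Node binary to produce 5000 chars of output.
const demoCommand =
  `"${process.execPath}" -e "process.stdout.write('x'.repeat(5000))"`

// With a 1000-byte limit the child is killed and the callback gets an error.
runWithLimit(demoCommand, 1000).then(r => console.log('limit 1000:', r))

// With Infinity the whole output is buffered and returned.
runWithLimit(demoCommand, Infinity).then(r => console.log('no limit:', r))
```

Note that the error only reaches you if you pass a callback to exec. The code in the question does not pass one, which would explain why the output just stops with no error reported.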
I am honestly shocked that this limit exists, and I will now be forced to do a code review across a large body of code to find every place where the child_process module is used and see if I need to add a maxBuffer option.
Consider it an example of how to design an interface poorly.
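You probably want to handle sub.on('exit', code => {}) or sub.on('close', code => {}), so you can check the exit status of Git and raise an error if the status is not 0. The two events are similar but not identical: 'close' fires only after the process has ended and its stdio streams have been closed, so it is the safer one to use while you are still reading stdout. Something like this:
sub.on('close', code => {
  if (code === 0) {
    resolve(...);
  } else {
    reject(...);
  }
});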
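For the "chunk by chunk" part of the question, an alternative to raising maxBuffer is child_process.spawn, which hands you the stdout stream without buffering the whole output. Below is a minimal sketch, assuming you only need the char and line counts; the helper name countOutput is made up for illustration. It counts newline characters rather than using split('\n').length, since the latter adds one extra "line" per chunk.

```javascript
const { spawn } = require('child_process')

// Hypothetical helper (not part of Node): streams a command's stdout and
// counts chars and lines without keeping the output in memory.
function countOutput (file, args) {
  return new Promise((resolve, reject) => {
    const sub = spawn(file, args)
    let chars = 0
    let lines = 0
    sub.stdout.setEncoding('utf8')
    sub.stdout.on('data', chunk => {
      chars += chunk.length
      // count newline characters, so a line that is split across two
      // chunks is still counted only once
      for (let i = 0; i < chunk.length; i++) {
        if (chunk[i] === '\n') lines++
      }
    })
    // emitted when the process could not be started (e.g. git not found)
    sub.on('error', reject)
    // emitted once the process has ended and its stdio streams are closed
    sub.on('close', (code, signal) => {
      if (code === 0) {
        resolve({ chars, lines })
      } else {
        reject(new Error('exited with code ' + code + ', signal ' + signal))
      }
    })
  })
}

countOutput('git', ['show'])
  .then(result => console.log(result))
  .catch(err => console.log('ERROR: ' + err.message))
```

Unlike exec, spawn takes the executable and its arguments separately and does not go through a shell by default, so a command string such as 'git show <commit sha>' has to be split into 'git' and ['show', '<commit sha>'].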