My NestJS application has a simple purpose: it reads a list of files line by line and upserts each parsed line as a user document in MongoDB.
The most important part of my code consists of:
let resultInsert = 0;
for (const file of FILES) {
  const result = await this.processFile(file);
  resultInsert += result;
}
and the function processFile():
async processFile(fileName: string): Promise<number> {
  let count = 0;
  return new Promise((resolve, reject) => {
    const s = fs
      .createReadStream(BASE_PATH + fileName, { encoding: 'latin1' })
      .pipe(es.split())
      .pipe(
        es
          .mapSync(async (line: string) => {
            count++;
            console.log(line);
            const line_splited = line.split('@');
            const user = {
              name: line_splited[0],
              age: line_splited[1],
              address: line_splited[2],
              job: line_splited[3],
              country: line_splited[4],
            };
            await this.userModel.updateOne(user, user, { upsert: true });
          })
          .on('end', () => {
            resolve(count);
          })
          .on('error', (err) => {
            reject(err);
          }),
      );
  });
}
The main problem is that around the ninth file, the process crashes with a memory failure: "Allocation failed - JavaScript heap out of memory". My problem looks similar to Parsing huge logfiles in Node.js - read in line-by-line, but the code still manages to fail.
I suspect the cause is that while I am still inserting lines from one file, I already open and start reading the next one, but I don't know how to handle that.
I could make it work by changing updateOne() to insertMany().
Quick explanation: instead of inserting documents one by one, we insert them in batches of 100k.
So I created an array of users, and whenever it reached 100k documents, I inserted the whole batch with insertMany().
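The batching approach can be sketched as follows. This is a self-contained illustration, not the exact code: parseLine mirrors the split("@") logic from the question, BATCH_SIZE and the injectable insertMany function are assumed names, and in the real application insertMany would be this.userModel.insertMany():

```typescript
type User = { name: string; age: string; address: string; job: string; country: string };

const BATCH_SIZE = 100_000; // flush to MongoDB every 100k documents

// Same parsing as in the question: fields separated by "@".
function parseLine(line: string): User {
  const [name, age, address, job, country] = line.split('@');
  return { name, age, address, job, country };
}

// Buffers parsed users and flushes them in bulk, instead of one
// updateOne() per line; returns the total number of lines processed.
async function processLines(
  lines: Iterable<string>,
  insertMany: (batch: User[]) => Promise<void>, // e.g. userModel.insertMany
): Promise<number> {
  let batch: User[] = [];
  let count = 0;
  for (const line of lines) {
    batch.push(parseLine(line));
    count++;
    if (batch.length >= BATCH_SIZE) {
      await insertMany(batch); // one round trip for the whole batch
      batch = [];              // release the buffered documents
    }
  }
  if (batch.length > 0) await insertMany(batch); // flush the remainder
  return count;
}
```

Because each batch is awaited and then dropped, memory stays bounded by BATCH_SIZE documents instead of growing with the number of pending per-line writes. Note that unlike updateOne() with upsert, insertMany() does not deduplicate existing documents.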