node.js, arrays, json, nodejs-stream

Remove duplicates from a huge JSON file in Node.js


I have a huge JSON file, over 80,000 MB in size, containing 700,000,000 records. The file looks like this:

    {
        "rows": [
            {"empId":"1014456","blockId":"b6566"},
            {"empId":"1014456","blockId":"b6566"},
            {"empId":"1014457","blockId":"b6556"},
            {"empId":"1014458","blockId":"b6567"},
            ...
        ]
    }

I want to remove the duplicates, using empId as the key. How do I do this in Node.js? Do I need to use streams?


Solution

  • You can use Lodash's uniqBy:

    const _ = require('lodash');

    const unique = _.uniqBy([
        {"empId":"1014456","blockId":"b6566"},
        {"empId":"1014456","blockId":"b6566"},
        {"empId":"1014457","blockId":"b6556"},
        {"empId":"1014458","blockId":"b6567"}
        // ...
    ], 'empId');

    Read more about it here: https://lodash.com/docs/4.17.15#uniqBy
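
  • For a file this large, though, loading the whole "rows" array into
    memory for _.uniqBy will not be practical, so yes, streams are the
    way to go. Below is a minimal sketch of one streaming approach. It
    assumes the third-party stream-json and stream-chain packages
    (npm install stream-json stream-chain), and the file names
    input.json and deduped.json are placeholders:

    const fs = require('fs');
    const {chain} = require('stream-chain');
    const {parser} = require('stream-json');
    const {pick} = require('stream-json/filters/Pick');
    const {streamArray} = require('stream-json/streamers/StreamArray');

    // Track empIds we have already written. Only the keys are kept in
    // memory, never the whole file.
    const seen = new Set();

    const out = fs.createWriteStream('deduped.json'); // placeholder path
    out.write('{"rows":[\n');
    let first = true;

    const pipeline = chain([
        fs.createReadStream('input.json'), // placeholder path
        parser(),                          // tokenize the JSON byte stream
        pick({filter: 'rows'}),            // descend into the "rows" array
        streamArray(),                     // emit one {key, value} per element
    ]);

    pipeline.on('data', ({value}) => {
        // Write each record the first time its empId is seen; skip repeats.
        if (!seen.has(value.empId)) {
            seen.add(value.empId);
            out.write((first ? '' : ',\n') + JSON.stringify(value));
            first = false;
        }
    });

    pipeline.on('end', () => out.end('\n]}\n'));

    One caveat: the Set of seen empIds must itself fit in memory. With
    700,000,000 records that can still be many gigabytes, in which case
    you would need an external sort on empId, or a disk-backed key
    store, instead of an in-memory Set.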