node.js · typescript · typeorm

Memory problems in TypeORM, or me just filling it up


I have written a database init script in TypeScript using TypeORM, and I seem to have caused a memory problem, but I can't figure out a way around it.

Currently the script imports three files: Profiles (55 records), User (5,306 records), and Logins (1,006,909 records).

I rewrote the calls so that in all cases the script builds a JSON array with all the records, then uses createQueryBuilder to execute the insert, as below:

getConnection()
    .createQueryBuilder()
    .insert()
    .into(EULogin)
    .values(loginChunk)
    .execute()
    .catch(error => console.log(error))

It works a charm for the first two, but when it comes to the last one (the 1,000,000 entries), it will not play ball, and I get a memory problem:

LOGIN: Committed 196000/1006909 (Chunk Size:500) to database in 155 MS
LOGIN: Committed 196500/1006909 (Chunk Size:500) to database in 823 MS

<--- Last few GCs --->

[60698:0x110008000] 34328 ms: Scavenge 1389.1 (1423.6) -> 1388.6 (1423.6) MB, 12.1 / 0.0 ms (average mu = 0.104, current mu = 0.099) allocation failure
[60698:0x110008000] 34339 ms: Scavenge 1389.3 (1423.6) -> 1388.9 (1423.6) MB, 10.4 / 0.0 ms (average mu = 0.104, current mu = 0.099) allocation failure
[60698:0x110008000] 34361 ms: Scavenge 1389.4 (1423.6) -> 1389.1 (1424.1) MB, 12.5 / 0.0 ms (average mu = 0.104, current mu = 0.099) allocation failure

<--- JS stacktrace --->

==== JS stack trace =========================================

0: ExitFrame [pc: 0x20ed7445be3d]
1: StubFrame [pc: 0x20ed7440d608]
2: StubFrame [pc: 0x20ed7502c4cc] Security context: 0x0fbfb411e6e9 <JSObject>
3: /* anonymous */(aka /* anonymous */) [0xfbf79c62a41] [/Users/bengtbjorkberg/Development/EUGrapherNode/node_modules/typeorm/query-builder/InsertQueryBuilder.js:348]

[bytecode=0xfbf93b851a1 offset=26](this=0x0fbfab4826f1 ,valueSet=0x0fbf8846ebc1 <Object map = 0xfbfbf7...

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
 1: 0x10003cf99 node::Abort() [/usr/local/bin/node]
 2: 0x10003d1a3 node::OnFatalError(char const*, char const*) [/usr/local/bin/node]
 3: 0x1001b7835 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/local/bin/node]
 4: 0x100585682 v8::internal::Heap::FatalProcessOutOfMemory(char const*) [/usr/local/bin/node]
 5: 0x100588155 v8::internal::Heap::CheckIneffectiveMarkCompact(unsigned long, double) [/usr/local/bin/node]
 6: 0x100583fff v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [/usr/local/bin/node]
 7: 0x1005821d4 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/usr/local/bin/node]
 8: 0x10058ea6c v8::internal::Heap::AllocateRawWithLigthRetry(int, v8::internal::AllocationSpace, v8::internal::AllocationAlignment) [/usr/local/bin/node]
 9: 0x10058eaef v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationSpace, v8::internal::AllocationAlignment) [/usr/local/bin/node]
10: 0x10055e434 v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationSpace) [/usr/local/bin/node]
11: 0x1007e6714 v8::internal::Runtime_AllocateInNewSpace(int, v8::internal::Object**, v8::internal::Isolate*) [/usr/local/bin/node]
12: 0x20ed7445be3d

Process finished with exit code 134 (interrupted by signal 6: SIGABRT)

I have tried opening the database in synchronous mode (not async). I have tried splitting the last import into chunks as small as 50 records at a time. I even tried opening and closing the database for each chunk, but that died because I could not get it to do so synchronously.

Opening the database with:

createConnection().then(connection => {

And below is the "chunk loader":

.on('end', () => {
    let compTime: number = new Date().getTime()
    console.log("LOGIN: Entities read: " + loginReadCounter + " in " + new Date(compTime - loginStartTime).getMilliseconds() + " MS")
    let currentChunk: number = 0;
    let chunkSize: number = 500;
    let loginChunk = [];
    let loginStartChunkTime: number = compTime;
    loginEntries.forEach(entry => {
        loginChunk.push(entry);
        currentChunk++;
        loginCommitCounter++;
        if (currentChunk === chunkSize) {
            // fire off an insert for this chunk (note: not awaited)
            getConnection()
                .createQueryBuilder()
                .insert()
                .into(EULogin)
                .values(loginChunk)
                .execute()
                .catch(error => console.log(error))

            let compTime: number = new Date().getTime()
            console.log("LOGIN: Committed " + loginCommitCounter + "/" + loginReadCounter + " (Chunk Size:" + loginChunk.length + ") to database in " + new Date(compTime - loginStartChunkTime).getMilliseconds() + " MS");
            currentChunk = 0;
            loginStartChunkTime = compTime;
            loginChunk = [];
        }
    });
})

Any ideas?

========================== EDIT FOLLOWING GOOD INPUT ====================

To try to sort my own head out, I moved the write into a separate function. I got await to work, but how do I stop the process from continuing after the call? await works inside the createConnection callback, but it does not work on createConnection itself, so the function returns straight away.

function syncDataWrite(dbEntitiy, dataSet){
    console.log("DBLOADER Started for: " + dataSet.length);
    createConnection().then(async connection => {
        console.log("DBLOADER Connected!");
        const completion = await createQueryBuilder()
            .insert()
            .into(dbEntitiy)
            .values(dataSet)
            .execute()
            .catch(error => console.log(error))
        console.log("DBLOADER SQL uploaded")
    })
    // console.log(dbEntity);
}

Solution

  • Your code suggests that, while you chunk things, you are executing everything in parallel, which is probably not what you want.

    I'd recommend rewriting this to run the inserts in sequence.

    The easiest fix by far is to switch to async/await and use a regular for loop. Assuming .execute() returns a promise, it would look something like this:

    const connection = getConnection();
    for (const entry of loginEntries) {
       // [snip]  
       await createQueryBuilder()
         .insert()
         .into(EULogin)
         .values(loginChunk)
         .execute()
       // [snip]
    }
    

    I stripped a bunch of your code, but I hope this general setup still makes sense.

    It's possible to do this without async/await, but it's going to look a whole lot more complicated.
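    For comparison, here's a minimal sketch of the same sequencing done with plain promise chaining, using the same createQueryBuilder as above (the chunks array of pre-built 500-record batches is a hypothetical stand-in for your chunking logic):

    // Start from a resolved promise and chain each insert onto the
    // previous one, so only one insert is in flight at a time.
    chunks.reduce<Promise<unknown>>(
        (previous, chunk) =>
            previous.then(() =>
                createQueryBuilder()
                    .insert()
                    .into(EULogin)
                    .values(chunk)
                    .execute()
            ),
        Promise.resolve()
    );

    Each .then() only runs after the previous insert resolves, which gives you the same sequencing as the for loop above, just harder to read.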

    Edit based on your edit.

    Here's a rewritten version of syncDataWrite:

    async function syncDataWrite(dbEntitiy, dataSet){
    
        console.log("DBLOADER Started for: " + dataSet.length);
        const connection = await createConnection();
        console.log("DBLOADER Connected!");
        const completion = await createQueryBuilder()
           .insert()
           .into(dbEntitiy)
           .values(dataSet)
           .execute();
    
    }
    

    Note that if you use syncDataWrite multiple times, once for each chunk, you still need to await syncDataWrite when it's called.
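
    For example, a caller that feeds the chunks through one at a time (loginChunks is a hypothetical array of pre-built chunks) might look like:

    async function loadAll(loginChunks) {
        for (const chunk of loginChunks) {
            // Wait for each chunk to finish before starting the next one.
            await syncDataWrite(EULogin, chunk);
        }
    }

    In that case you'd also want to create the connection once up front rather than inside syncDataWrite, since calling createConnection repeatedly for the same connection will complain that it already exists.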