This question is about memory allocation. I am using Array in this example because it reproduces the behavior of another, more complex problem related to database code in a third-party library. I need to understand why memory allocation behaves this way in a "node in container" environment.
I am running node.js (12.18.4, lts-stretch) in a Docker Desktop Community (2.3.0.5, Windows) container. The container has a limit of 2GB of memory. (However, I see the same behavior regardless of how much memory I assign to Docker.)
In node, this statement works as expected:
var a = Array(32 * 1024 * 1024).fill(0);
However, when this statement executes, node starts allocating memory without limit, as if it were stuck in an infinite loop:
var a = Array(32 * 1024 * 1024 + 1).fill(0);
I do not see the above behavior when running node.exe from a Windows PowerShell prompt -- only when running a node container (https://hub.docker.com/_/node).
Why does the memory allocation fail to work correctly at 32MB + 1 elements when running node in the container?
V8 developer here. In short: Mike 'Pomax' Kamermans' guess is spot on.
32 * 1024 * 1024 == 2**25 is the limit up to which new Array(n) will allocate a contiguous ("C-like") backing store of length n. Filling such a backing store with zeroes is relatively fast and requires no further allocations.
With a longer length, V8 will create the array in "dictionary mode"; i.e., its backing store will be an (initially empty) dictionary. Filling this dictionary is slower: firstly because dictionary accesses are a bit slower, and secondly because the dictionary's backing store needs to be grown a couple of times, which means copying over all existing elements. Frankly, I'm surprised that the array stays in dictionary mode; in theory it should switch to flat-array mode when it reaches a certain density. Interestingly, when I run
var a = new Array(32 * 1024 * 1024 + 1); for (var i = 0; i < a.length; i++) a[i] = 0;
then that's what happens. Looks like the implementation of fill could be improved there; on the other hand, I'm not sure how relevant this case is in practice...
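To make the contrast concrete, here is a sketch comparing fill against an index-assignment loop. The function names and the small demo size N are my own; reproducing the question's behavior requires N = 32 * 1024 * 1024 + 1, which (per the above) needs several hundred MB and can take seconds in dictionary mode.

```javascript
// Sketch: two ways to zero-fill a pre-sized array.
// At small sizes both stay in flat ("fast elements") mode; above 2**25
// elements, fill() keeps the array in dictionary mode, while the
// element-by-element stores in the loop let it switch back to flat mode.

function fillWithFill(n) {
  return new Array(n).fill(0);
}

function fillWithLoop(n) {
  const a = new Array(n);
  for (let i = 0; i < a.length; i++) a[i] = 0;
  return a;
}

const N = 1 << 20; // demo size; use 32 * 1024 * 1024 + 1 to reproduce the question
console.time('fill');
const a = fillWithFill(N);
console.timeEnd('fill');
console.time('loop');
const b = fillWithLoop(N);
console.timeEnd('loop');
console.log(a.length === b.length); // both produce arrays of the same length
```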
Side notes:
Regarding "at 32MB + 1 elements": we're not talking about 32 MB here. Each entry takes 32 or 64 bits (depending on platform and pointer compression), so 2**25 entries occupy either 128 or 256 MB.
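The arithmetic behind those figures, as a quick back-of-the-envelope check:

```javascript
// Backing-store size at the 2**25 boundary: entries times bytes-per-entry.
const entries = 32 * 1024 * 1024;        // 2**25 elements
const toMB = bytes => bytes / (1024 * 1024);
console.log(toMB(entries * 4) + ' MB');  // 128 MB if each entry is 32 bits (pointer compression)
console.log(toMB(entries * 8) + ' MB');  // 256 MB if each entry is 64 bits
```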
Regarding "node starts allocating memory without limit, as if it were stuck in an infinite loop": the operation actually does terminate after a while (about 7 seconds on my machine), and memory allocation peaks at a little over 900 MB. The reason is that when you actually use all entries, dictionary mode is significantly less space-efficient than a flat array backing store, because each entry needs to store its index, its attributes, and the value itself; on top of that, dictionaries by their nature need some unused capacity to avoid overly many hash collisions.
Regarding "I am using Array in this example because it replicates the behavior of another, more complex problem": given how specific the behavior seen here is to arrays, I wonder how accurately this simplified case reflects the behavior you're seeing elsewhere. If your real code does not allocate and fill huge arrays, then whatever is going on there is probably something else.