node.js / v8 Reading Large Files Into Memory


Question

How can you read files > 1.1 GB into memory under node.js?

Example

I'm trying to use topojson under node.js to convert > 1.1 GB GeoJSON files to TopoJSON format.

$ topojson -o outfile.json larger_than_one_point_one_GB_input_file.json

(The above has worked for files up to 517 MB.)

Running it on the larger file results in the following error:

buffer.js:242
this.parent = new SlowBuffer(this.length);
                    ^
RangeError: length > kMaxLength
    at new Buffer (buffer.js:242:21)
    at Object.fs.readFileSync (fs.js:200:14)
    at /usr/local/share/npm/lib/node_modules/topojson/bin/topojson:61:26
    at Array.forEach (native)
    at Object.<anonymous> (/usr/local/share/npm/lib/node_modules/topojson/bin/topojson:60:8)
    at Module._compile (module.js:449:26)
    at Object.Module._extensions..js (module.js:467:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)
    at Module.runMain (module.js:492:10)

What I've Tried so Far

  • Extensive searching
  • Command line memory settings
    • --max-stack-size=2147000000
    • --max_executable_size=2000
    • --max_new_space_size=2097152
    • --max_old_space_size=2097152
  • Compiling the most recent v8 version into a custom node.js install

Versions

  • node.js: v0.8.15
  • v8: 3.11.10.25

Solution

  • The problem is that topojson reads the entire file with fs.readFileSync, which allocates a single buffer the size of the file and then fills it. But node buffers have a maximum size of 0x3FFFFFFF bytes (kMaxLength), or 1 GiB minus 1 byte, so any larger file throws that RangeError. The limit is a constant compiled into node, which is why the command-line memory flags make no difference.

    Solution? Open the topojson source and replace readFileSync with a streaming read that never holds the entire file in one block. Or, if you're feeling really hackish, recompile node with a larger kMaxLength constant.
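
    As a rough illustration of the streaming approach, here is a minimal sketch that reads a file in fixed-size chunks with fs.createReadStream, so no single buffer has to hold the whole file. It assumes a current node.js release (Promises and arrow functions did not exist in v0.8). The helper name processInChunks and the 64 KiB chunk size are made up for this example, and it is not how topojson itself is written.

```javascript
const fs = require('fs');

// Hypothetical helper (not part of node.js or topojson): feeds a file to
// `onChunk` one Buffer at a time and resolves with the total bytes read.
function processInChunks(path, onChunk) {
  return new Promise((resolve, reject) => {
    let total = 0;
    // highWaterMark caps the size of each chunk; 64 KiB is an arbitrary choice.
    fs.createReadStream(path, { highWaterMark: 64 * 1024 })
      .on('data', (chunk) => {
        total += chunk.length;
        onChunk(chunk);
      })
      .on('error', reject)
      .on('end', () => resolve(total));
  });
}
```

    A caller would hand each chunk to an incremental parser. That part is not shown here, because JSON.parse needs the complete text at once, so converting a GeoJSON file this way also requires a parser that accepts input piece by piece.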