Tags: javascript, node.js, memory, large-files, bigdata

JavaScript - handling 3 GB of data for regular calculations


I am currently thinking about the design of a feature for our app, and since it is by far the heaviest part, I am not sure how to approach it:

We want to give users the ability to backtest their assumptions against a large dataset.

  • The dataset is over 3 GB in size (JSON).
  • It has to be preprocessed (the format needs to be changed) before it is usable for calculations.
  • It could be split into smaller files, as the topics within the large dataset are separable.

In the beginning I thought I would read the entire file into our store and work from there. I am starting to doubt that memory will survive that (what kind of consumption should I assume, running on AWS?).
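
For the record, this is the naive version I had in mind; as far as I understand, for a 3 GB file it would run out of heap or hit V8's maximum string length before JSON.parse even runs (the file name is made up):

```javascript
const fs = require('fs');

// Naive approach: load and parse everything at once.
// For a 3 GB file this fails: readFileSync with 'utf8' would have to
// produce a single string larger than V8's string length limit, and
// even a buffer-based parse would need several GB of heap.
const data = JSON.parse(fs.readFileSync('dataset.json', 'utf8'));
```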

The alternative I thought about is reading from the files directly, but then I would either need to split them (isn't there a limit of roughly 250 MB per string in JavaScript?) or stream them (is that a good idea?).
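
A minimal sketch of what the streaming option could look like, assuming the dataset were first converted to newline-delimited JSON, one record per line (the file name data.ndjson is made up):

```javascript
const fs = require('fs');
const readline = require('readline');

// Stream the file line by line so only one record is ever in memory,
// regardless of the total file size.
async function backtest() {
  const rl = readline.createInterface({
    input: fs.createReadStream('data.ndjson'),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });

  let count = 0;
  for await (const line of rl) {
    const record = JSON.parse(line); // one small object at a time
    // ...run the backtest calculation against `record` here...
    count++;
  }
  console.log(`processed ${count} records`);
}

backtest().catch(console.error);
```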

Before I start writing code, I would like to understand how to approach this amount of data (is it really that massive?) most efficiently in JavaScript. For now, everything is possible.

Thanks! Jan


Solution

  • I would move your data to a database on a server. Personally, I would build a RESTful API in front of the database and have your app call that API instead of touching the data directly.

    Since you are writing JavaScript, you could create a Node server for the RESTful API and store the data in a SQL or NoSQL database. Then take the data from the JSON file and load it into the database (a rough sketch of both steps follows at the end of this answer).

    I would suggest looking into the different database types and using a database to hold and manipulate your data. The big question is relational vs. non-relational databases. Read this to get more information: http://jlamere.github.io/databases/
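
A minimal sketch of that setup, assuming the dataset is one large JSON array, using the stream-json package for incremental parsing, better-sqlite3 for storage, and Express for the API (the file name, table schema, and the topic field are all made up for illustration):

```javascript
const fs = require('fs');
const { parser } = require('stream-json');
const { streamArray } = require('stream-json/streamers/StreamArray');
const Database = require('better-sqlite3');
const express = require('express');

const db = new Database('backtest.db');
db.exec('CREATE TABLE IF NOT EXISTS records (topic TEXT, payload TEXT)');

// One-time import: parse the 3 GB JSON array incrementally so the
// whole file is never held in memory. 'dataset.json' and the
// 'topic' field are hypothetical.
function importDataset() {
  return new Promise((resolve, reject) => {
    const insert = db.prepare(
      'INSERT INTO records (topic, payload) VALUES (?, ?)'
    );
    const pipeline = fs.createReadStream('dataset.json')
      .pipe(parser())
      .pipe(streamArray());
    pipeline.on('data', ({ value }) => {
      insert.run(value.topic, JSON.stringify(value));
    });
    pipeline.on('end', resolve);
    pipeline.on('error', reject);
  });
}

// REST endpoint the app calls; the database does the heavy filtering,
// so the app only ever receives the slice of data it asked for.
const app = express();
app.get('/records/:topic', (req, res) => {
  const rows = db
    .prepare('SELECT payload FROM records WHERE topic = ?')
    .all(req.params.topic);
  res.json(rows.map((r) => JSON.parse(r.payload)));
});

importDataset()
  .then(() => app.listen(3000, () => console.log('API on port 3000')))
  .catch(console.error);
```

SQLite is just a stand-in here; since the topics in your dataset are separable, one table (or one database) per topic would also map naturally onto the data, and any SQL or NoSQL database would slot into the same structure.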