node.js, nosql, analytics, amazon-dynamodb

How to quickly build a large-scale analytics server?


I need to build a large-scale analytics server (seven figures and up) quickly and cheaply.

Piwik would be the easy choice, but from what I've gathered so far, Piwik is rather hard to scale and can require hefty servers to handle the load.

My second idea would be to create a quick-and-dirty Node.js server that just pushes everything to Amazon DynamoDB, so that one can start gathering data from day one and build the UI later. That would be quick to create and to scale (vertically and horizontally). However, I'm wondering whether DynamoDB is the right choice for such a use (gather data, generate reports).
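To make the "quick and dirty" collector concrete, here is a minimal sketch using only Node's built-in http module. The `/collect` path, the event fields (`site`, `type`, `url`, `ts`) and the `sink` callback are assumptions made for illustration; the actual write to DynamoDB (or any other store) would live inside `sink`.

```javascript
const http = require('http');

// Turn a raw request body into a normalized event, or return null if unusable.
// The field names (site, type, url, ts) are hypothetical.
function parseEvent(body, now = Date.now()) {
  let data;
  try {
    data = JSON.parse(body);
  } catch (err) {
    return null; // not valid JSON
  }
  if (!data || typeof data.site !== 'string' || typeof data.type !== 'string') {
    return null; // missing required fields
  }
  return { site: data.site, type: data.type, url: data.url || null, ts: now };
}

// Build the collector. `sink` is any function(event) that persists one event;
// a database write would go there (assumption: the store is chosen later).
function createCollector(sink) {
  return http.createServer((req, res) => {
    if (req.method !== 'POST' || req.url !== '/collect') {
      res.writeHead(404);
      res.end();
      return;
    }
    let body = '';
    req.on('data', (chunk) => { body += chunk; });
    req.on('end', () => {
      const event = parseEvent(body);
      if (!event) {
        res.writeHead(400);
        res.end();
        return;
      }
      sink(event);
      res.writeHead(204); // accepted, nothing to return
      res.end();
    });
  });
}

// Usage (not started automatically):
//   const events = [];
//   createCollector((e) => events.push(e)).listen(3000);
```

Keeping the storage behind a single `sink` function means the collector can start gathering data immediately while the choice of database stays open.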


Solution

  • I'm using DynamoDB professionally and would not use it for your application.

    DynamoDB truly has tons of constraints. Among them, a table can have only one hash_key and, optionally, one range_key.

    You may do some "analytics" for items grouped under a given hash_key using query, but really nothing fancy. For complex queries, you would have to use scan or EMR, which are slow and expensive and have a couple of drawbacks due to throttling.

    Nonetheless, NoSQL seems a good choice, at least for the prototyping stage of your application, but I would recommend MongoDB instead. You can index any field, run complex queries, and not worry about throttling. Sharding and replication are not too hard to set up.

    MongoDB has a strong ecosystem and community, which DynamoDB does not have (yet), as it is much younger. MongoDB also has hosted offerings that would allow you to bootstrap your application as quickly as you would with DynamoDB.
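The contrast drawn in this answer can be sketched in two parts. The first is a toy in-memory model of a hash_key + range_key table. It is not the AWS SDK and only imitates the access pattern: a query is confined to one hash_key, while anything else needs a scan that reads every item. The second builds a MongoDB aggregation pipeline as plain data. The collection layout (`site`, `type`, `ts` stored as a date) and all function and class names are assumptions for illustration.

```javascript
// Part 1: toy stand-in for a table keyed by hash_key + range_key (NOT the AWS SDK).
class ToyKeyedTable {
  constructor() {
    this.partitions = new Map(); // hash_key -> items sorted by range_key
    this.itemsRead = 0;          // crude stand-in for consumed read capacity
  }

  put(hashKey, rangeKey, attrs) {
    const items = this.partitions.get(hashKey) || [];
    items.push({ hashKey, rangeKey, ...attrs });
    items.sort((a, b) => a.rangeKey - b.rangeKey);
    this.partitions.set(hashKey, items);
  }

  // Query: exactly one hash_key, optional bounds on range_key.
  // Only items of that partition within the bounds are counted as read.
  query(hashKey, from = -Infinity, to = Infinity) {
    const items = this.partitions.get(hashKey) || [];
    const hits = items.filter((i) => i.rangeKey >= from && i.rangeKey <= to);
    this.itemsRead += hits.length;
    return hits;
  }

  // Scan: any predicate, but every item in the table is read first.
  scan(predicate) {
    const all = [...this.partitions.values()].flat();
    this.itemsRead += all.length;
    return all.filter(predicate);
  }
}

// Part 2: a MongoDB aggregation pipeline as plain data: page views per day
// for one site. Assumes `ts` is stored as a date; `from` and `to` are Date objects.
function dailyPageviewsPipeline(site, from, to) {
  return [
    { $match: { site, type: 'pageview', ts: { $gte: from, $lt: to } } },
    {
      $group: {
        _id: { $dateToString: { format: '%Y-%m-%d', date: '$ts' } },
        views: { $sum: 1 },
      },
    },
    { $sort: { _id: 1 } },
  ];
}
```

With the official MongoDB Node.js driver, such a pipeline would typically be passed to `collection.aggregate(pipeline)`, and an index created with `collection.createIndex({ site: 1, ts: 1 })` would be a plausible way to support the `$match` stage. In the toy table, a report that cuts across hash keys (for example, all views of one URL on every site) has no option other than `scan`, which is the cost this answer warns about.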