
How to design a system in which we can query top results in the last n hours


I was asked this question in an interview. The setup: assume we are receiving millions of events, each with a timestamp and other details. The design should let the end user query the most frequent records over the last 10 minutes, 9 hours, or maybe 3 months.

An event can be seen as follows:

event_type: {CRUD + Search}
event_info: xxx
timestamp: ts...

Solution

  • The easiest way to figure this out is to look at how other stream-processing or map-reduce libraries do it (and I have a feeling your interviewers have seen these libraries). It's basically real-time map-reduce (you can look up how that works as well).

    I will outline two techniques for event processing. In reality most companies need to do both.

    New-school stream processing (real time)

    Let's assume for now that they don't want the actual events but rather aggregates, the more likely case (I think that was the intent of your question).

    An example stream processing project is pipelinedb (they describe how it works at the bottom of their home page).

    1. Events go into a queue/ring buffer.
    2. A worker process reads those events in batches and rolls them up into partial buckets or windows.
    3. Finally, a combiner or reducer takes the micro-batches and does the actual updating. An example would be event counts. Because we are using a queue, events arrive in order, and depending on the queue we might be able to have multiple consumers performing the combining operation. (A minimal sketch of this pipeline follows the list.)
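
    A minimal in-process sketch of those three steps, assuming events are (event_type, unix_timestamp) pairs and per-minute count buckets; the queue, worker, and bucket names are illustrative, not from any particular library:

    ```python
    import queue
    import threading
    from collections import Counter

    events = queue.Queue()   # step 1: the queue producers push (event_type, ts) onto
    minute_buckets = {}      # minute-truncated unix ts -> Counter of event_type -> count
    buckets_lock = threading.Lock()

    def rollup_worker(batch_size=1000):
        """Steps 2-3: read events in micro-batches, roll up, combine into buckets."""
        while True:
            batch = [events.get()]                     # block until at least one event
            while len(batch) < batch_size and not events.empty():
                batch.append(events.get())
            partials = {}                              # step 2: per-minute partial rollups
            for event_type, ts in batch:
                minute = ts - ts % 60                  # truncate timestamp to the minute
                partials.setdefault(minute, Counter())[event_type] += 1
            with buckets_lock:                         # step 3: combiner updates the buckets
                for minute, counts in partials.items():
                    minute_buckets.setdefault(minute, Counter()).update(counts)

    threading.Thread(target=rollup_worker, daemon=True).start()
    events.put(("Search", 1700000000))                 # a producer enqueues an event
    ```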

    So if you want minute counts, you do rollups per minute and store only the sum of the events for that minute. This turns out to be fairly small space-wise, so you can keep it in memory.

    If you wanted those counts for a day, a month, or even a year, you would just add up all the minute-count buckets.
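
    Continuing the sketch above, a "top events over the last n seconds" query is then just a sum over the minute buckets covering the window (minute_buckets is the assumed structure from the previous sketch):

    ```python
    import time
    from collections import Counter

    def top_events(minute_buckets, window_seconds, k=10, now=None):
        """Sum the minute buckets covering the last window_seconds; return the top k."""
        now = int(now if now is not None else time.time())
        start = now - window_seconds
        start -= start % 60                        # align to a minute boundary
        total = Counter()
        for minute in range(start, now + 1, 60):   # one bucket lookup per minute
            total.update(minute_buckets.get(minute, ()))
        return total.most_common(k)

    # e.g. the 10 most frequent event types over the last 9 hours:
    # top_events(minute_buckets, window_seconds=9 * 60 * 60)
    ```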

    Now there is, of course, a major problem with this technique: you need to know a priori which aggregates and pivots you want to collect.

    But in exchange you get extremely fast lookup of results.

    Old-school data warehousing (partitioning) and map-reduce (batch processing)

    Now let's assume they do want the actual events for a certain time period. This is expensive, because if you store all the events in one place, lookup and retrieval are difficult. But if you use the fact that time is hierarchical, you can store the events in a tree of tuples.

    The reason you would want the actual events is that you are doing ad hoc querying and are willing to wait for the queries to run.

    1. You need some sort of queue for the stream of events.
    2. A worker reads the queue and partitions the events based on time. For example, you would have a partition for a certain day. This is akin to sharding, and many storage systems support it natively (e.g. Postgres partitions).
    3. When you want the events over a certain period, you union the partitions (see the sketch after this list).
    4. The partitioning is essentially hierarchical (minutes < hours < days, etc.), which means you can do tree-like operations on the partitions.
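
    A minimal file-based sketch of this scheme, with one partition per day; it stands in for your store's native partitioning (e.g. Postgres partitions), and the paths and helper names are illustrative:

    ```python
    import json
    from datetime import datetime, timedelta, timezone
    from pathlib import Path

    ROOT = Path("events")   # one newline-delimited JSON file per day partition

    def write_event(event_type, event_info, ts):
        """Step 2: route each event to the partition for its day (sharding by time)."""
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date()
        ROOT.mkdir(exist_ok=True)
        with (ROOT / f"{day.isoformat()}.ndjson").open("a") as f:
            f.write(json.dumps({"event_type": event_type,
                                "event_info": event_info,
                                "timestamp": ts}) + "\n")

    def events_between(start_day, end_day):
        """Step 3: union the day partitions covering [start_day, end_day]."""
        day = start_day
        while day <= end_day:
            path = ROOT / f"{day.isoformat()}.ndjson"
            if path.exists():
                with path.open() as f:
                    for line in f:
                        yield json.loads(line)
            day += timedelta(days=1)

    # e.g. all events from the last three days:
    # from datetime import date
    # list(events_between(date.today() - timedelta(days=3), date.today()))
    ```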

    Such events are known as time-series data, and there are databases built to store them so that the partitioning index is automatic and fast. These are called TSDBs, which you can google for more info.

    An example TSDB product would be influxdb.

    Now, going back to the fact that time (or at least how humans represent it) is organized like a tree, we can parallelize operations over it. This is because a tree is a DAG (directed acyclic graph). With a DAG you can do some analysis and basically recurse over the branches (also known as fork/join).
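
    A minimal fork/join sketch over day partitions; count_partition is a hypothetical leaf operation standing in for whatever per-partition work your store actually performs:

    ```python
    from collections import Counter
    from concurrent.futures import ThreadPoolExecutor
    from datetime import date, timedelta

    def count_partition(day):
        """Hypothetical leaf: scan one day partition, return event_type -> count."""
        # ... read the partition for `day` (e.g. via events_between above) ...
        return Counter()

    def count_range(start_day, end_day):
        """Fork one task per day partition (the tree's leaves), then join the results."""
        days, d = [], start_day
        while d <= end_day:
            days.append(d)
            d += timedelta(days=1)
        total = Counter()
        with ThreadPoolExecutor() as pool:         # fork: one task per branch/leaf
            for partial in pool.map(count_partition, days):
                total.update(partial)              # join: reduce the partial counts
        return total

    # e.g. count_range(date(2024, 1, 1), date(2024, 3, 31)) fans out over ~90 partitions
    ```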

    An example generic parallel storage product would be citusdb.

    Now of course this method has a massive drawback: it is expensive! Even if you make it fast by increasing the number of nodes, you will have to pay for those nodes (distributed shards). And in theory the performance should scale linearly, but in practice it does not (I will save you the details).