I have the following SQL table of flows in a network. Each flow is defined by a timestamp and the nodes it passes through, and every flow has a volume.
| ID | timestamp | source | handover | nexthop | destination | volume |
|----|-----------|--------|----------|---------|-------------|--------|
| 01 | 123456788 | src001 | hnd 001 | nxt 002 | dst 001 | 1.23 |
| 02 | 123456789 | src002 | hnd 001 | nxt 002 | dst 003 | 4.32 |
| .. | ......... | ...... | ....... | ....... | ....... | .... |
At the moment the table consists of 26 million rows (around 2 months of data), and since there is an entry for every flow on a per-hour basis, it will grow rapidly.
Currently I run queries that sum up flow volumes over arbitrary time ranges (e.g. 8 days), filtered by node, like this:
SELECT source, handover, nexthop, destination, SUM(volume)
FROM flows  -- placeholder table name
WHERE timestamp >= ts_start AND timestamp <= ts_end
AND ( source = 123 OR source = 345 ) AND ( nexthop = 789 )
GROUP BY source, handover, nexthop, destination
As our database grows and our already slow queries need speeding up, I want to, in a sense, partition the table into various coarser time slices (e.g. 6 hours, 12 hours, 24 hours), but in such a way that all these tables coexist.
So a query for the time range from 07/17/16 10 PM to 07/19/16 6 AM would be split by timestamp: the range 07/17/16 10 PM to 07/18/16 12 AM would be answered by the standard 1-hour table, the range 07/18/16 12 AM to 07/19/16 12 AM by the 24-hour table, and the rest by the 6-hour table.
Is there a way in SQL to do the aggregating and splitting automatically, as well as the answering of the queries, so that I don't need to implement all the logic in my application?
Thanks in advance!
The short answer is: no, MySQL does not have such built-in functionality. MySQL does support table partitioning, but you need to specify the exact range for each partition, since MySQL has to be able to determine unambiguously which partition a record falls into. A single record must be in exactly one partition.
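To illustrate what range partitioning looks like, here is a minimal sketch. The table name `flows`, the column types, the integer node IDs, and the monthly boundaries are all assumptions for the example, and the timestamps are assumed to be Unix epoch seconds in UTC.

```sql
-- Sketch only: table name, column types and partition boundaries are assumptions.
CREATE TABLE flows (
    id          BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    `timestamp` INT UNSIGNED    NOT NULL,  -- assumed: Unix epoch seconds, UTC
    source      INT UNSIGNED    NOT NULL,
    handover    INT UNSIGNED    NOT NULL,
    nexthop     INT UNSIGNED    NOT NULL,
    destination INT UNSIGNED    NOT NULL,
    volume      DOUBLE          NOT NULL,
    -- every unique key must contain the partitioning column
    PRIMARY KEY (id, `timestamp`),
    KEY idx_time_nodes (`timestamp`, source, nexthop)
) ENGINE=InnoDB
PARTITION BY RANGE (`timestamp`) (
    PARTITION p2016_06 VALUES LESS THAN (1467331200),  -- before 2016-07-01 00:00 UTC
    PARTITION p2016_07 VALUES LESS THAN (1470009600),  -- before 2016-08-01 00:00 UTC
    PARTITION p2016_08 VALUES LESS THAN (1472688000),  -- before 2016-09-01 00:00 UTC
    PARTITION p_future VALUES LESS THAN MAXVALUE
);

-- A query restricted by timestamp only has to read the matching partitions
-- (here 2016-07-17 22:00 UTC up to 2016-07-19 06:00 UTC, all inside p2016_07).
SELECT source, handover, nexthop, destination, SUM(volume)
FROM flows
WHERE `timestamp` >= 1468792800 AND `timestamp` < 1468908000
  AND source IN (123, 345) AND nexthop = 789
GROUP BY source, handover, nexthop, destination;
```

Partitioning like this can reduce the amount of data scanned, but each row still lives in exactly one partition at hourly granularity, so it does not give you the coexisting 6-hour or 24-hour aggregates you describe.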
What you are trying to achieve looks more like a data warehousing problem, where you build tables for various time dimensions.
If you do not mind using the MyISAM table engine instead of InnoDB, then you may consider using the MERGE table engine:
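As a rough sketch of that warehousing approach, a 24-hour roll-up table could be filled from the hourly data with an `INSERT ... SELECT`. The names `flows` (the hourly table) and `flows_24h`, the column types, and the choice of UTC day boundaries are assumptions made for the example.

```sql
-- Sketch only: `flows`, `flows_24h` and all column types are assumptions;
-- timestamps are assumed to be Unix epoch seconds in UTC.
CREATE TABLE flows_24h (
    bucket_start INT UNSIGNED NOT NULL,  -- epoch seconds at 00:00 UTC of the day
    source       INT UNSIGNED NOT NULL,
    handover     INT UNSIGNED NOT NULL,
    nexthop      INT UNSIGNED NOT NULL,
    destination  INT UNSIGNED NOT NULL,
    volume       DOUBLE       NOT NULL,
    PRIMARY KEY (bucket_start, source, handover, nexthop, destination)
);

-- Aggregate one completed day (2016-07-18 UTC) from the hourly rows.
-- Meant to be run once per day, after that day's hourly data is complete.
INSERT INTO flows_24h (bucket_start, source, handover, nexthop, destination, volume)
SELECT (`timestamp` DIV 86400) * 86400 AS bucket_start,
       source, handover, nexthop, destination, SUM(volume)
FROM flows
WHERE `timestamp` >= 1468800000 AND `timestamp` < 1468886400
GROUP BY bucket_start, source, handover, nexthop, destination;
```

A 6-hour or 12-hour table would follow the same pattern with 21600 or 43200 in place of 86400. Scheduling the refresh and deciding which table answers which part of a query is still up to you.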
The MERGE storage engine, also known as the MRG_MyISAM engine, is a collection of identical MyISAM tables that can be used as one. “Identical” means that all tables have identical column and index information.
This way you can have overlapping data (same record in multiple tables). If you know that you need data for a specific 6-hour period, then you can query just the single table. Or you can query the entire merge table if necessary.
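A minimal sketch of such a setup, assuming one MyISAM table per month (the table names, column types, and the monthly split are illustrative assumptions, not something taken from your schema):

```sql
-- Sketch only: table names, column types and the per-month split are assumptions.
CREATE TABLE flows_2016_07 (
    id          INT UNSIGNED NOT NULL,
    `timestamp` INT UNSIGNED NOT NULL,  -- assumed: Unix epoch seconds
    source      INT UNSIGNED NOT NULL,
    handover    INT UNSIGNED NOT NULL,
    nexthop     INT UNSIGNED NOT NULL,
    destination INT UNSIGNED NOT NULL,
    volume      DOUBLE       NOT NULL,
    KEY idx_time (`timestamp`)
) ENGINE=MyISAM;

-- Same columns and indexes, as the MERGE engine requires.
CREATE TABLE flows_2016_08 LIKE flows_2016_07;

-- The MERGE table repeats the definition and lists the underlying tables.
CREATE TABLE flows_all (
    id          INT UNSIGNED NOT NULL,
    `timestamp` INT UNSIGNED NOT NULL,
    source      INT UNSIGNED NOT NULL,
    handover    INT UNSIGNED NOT NULL,
    nexthop     INT UNSIGNED NOT NULL,
    destination INT UNSIGNED NOT NULL,
    volume      DOUBLE       NOT NULL,
    KEY idx_time (`timestamp`)
) ENGINE=MERGE UNION=(flows_2016_07, flows_2016_08) INSERT_METHOD=LAST;

-- Query a single period directly ...
SELECT source, SUM(volume) FROM flows_2016_07 WHERE nexthop = 789 GROUP BY source;

-- ... or everything through the MERGE table.
SELECT source, SUM(volume) FROM flows_all WHERE nexthop = 789 GROUP BY source;
```

Because all underlying tables must share one structure, this helps with splitting by period but does not by itself produce pre-aggregated data at different granularities.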
Otherwise, you have to code your own solution entirely.
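For a hand-coded solution, the query from your example (07/17/16 10 PM to 07/19/16 6 AM) could be stitched together with `UNION ALL`, roughly like this. The tables `flows` (hourly), `flows_6h` and `flows_24h`, the `bucket_start` column, and the UTC epoch-second timestamps are all assumptions for the sketch; your application would still have to compute the three boundaries.

```sql
-- Sketch only: `flows`, `flows_6h`, `flows_24h` and `bucket_start` are
-- hypothetical names; timestamps are assumed to be epoch seconds in UTC
-- that mark the start of each hour or bucket.
SELECT source, handover, nexthop, destination, SUM(volume) AS volume
FROM (
    -- 2016-07-17 22:00 up to 2016-07-18 00:00 from the hourly table
    SELECT source, handover, nexthop, destination, volume
    FROM flows
    WHERE `timestamp` >= 1468792800 AND `timestamp` < 1468800000
      AND source IN (123, 345) AND nexthop = 789

    UNION ALL

    -- the full day 2016-07-18 from the 24-hour table
    SELECT source, handover, nexthop, destination, volume
    FROM flows_24h
    WHERE bucket_start >= 1468800000 AND bucket_start < 1468886400
      AND source IN (123, 345) AND nexthop = 789

    UNION ALL

    -- 2016-07-19 00:00 up to 06:00 from the 6-hour table
    SELECT source, handover, nexthop, destination, volume
    FROM flows_6h
    WHERE bucket_start >= 1468886400 AND bucket_start < 1468908000
      AND source IN (123, 345) AND nexthop = 789
) AS pieces
GROUP BY source, handover, nexthop, destination;
```

The node filter is repeated in each branch so that every table can use its own indexes before the partial results are combined.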