Search code examples
mysqldatabasedatabase-designdata-modelingkpi

Storing data counts from table into a "trending" table


We have a table for which we have to present many many counts for different combinations of fields. This takes quite a while to do on the fly and doesn't provide historical data, so I'm thinking in the best way to store those counts in another table, with a timestamp, so we can query them fast and get historical trends. For each count we need 4 pieces of information to identify it, and there are about 1000 different metrics we would like to store. I'm thinking on three different strategies, having a count and a timestamp but varying in how to identify the count for retrieval.

  1. 1 table with 4 fields to identify the count, the 4 fields wouldn't be normalized as they contain data from different external tables.
  2. 1 table with 1 "tag" field, which will contain the 4 pieces of information as a tag. This tags could be enriched and kept in another table maybe having a field for each tag part and linking them to the external tables.
  3. Different tables for the different groups of counts to be able to normalize on one or more fields, but this will need anywhere from 6 to tens of tables.

I'm going with the first one, not normalized at all, but wondering if anyone has a better or simpler way to store all this counts.

Sample of a value: status,installed,all,virtual,1234,01/05/2015

  • First field, status, can have up to 10 values
  • Second field, installed, can have up to 10 per different field 1
  • Third field, all,can have up to 10 different values, but they are the same for all categories
  • Fourth field, virtual, can have up to 30 values and will also be the same for all previous categories.
  • Last two fields will be a number and a timestamp

Thanks, Isaac


Solution

  • When you have a lot of metrics and you don't need to use them to do intra-metrics calculation you can go for the 1. solution.

    I would probably build a table like this

    Satus_id | Installed_id | All_id | Virtual_id | Date | Value
    

    Or if the combination of the first four columns have a proper name, I would probably create two tables (I think you refer to this possibility as the second solution with the 2):

    Metric Table
    Satus_id | Installed_id | All_id | Virtual_id | Metric_id | Metric_Name 
    
    Values Table 
    Metric_id | Date | Value
    

    This is good if you have names for your metrics or other details which otherwise you will need to duplicate for each combination with the first approach.

    In both cases it will be a bit complicated to do intra-rows operations using different metrics, for this reason this approach is suggested only for high level KPIs.

    Finally, because all possible combination for the last two fields are always present in you table you can think to convert them to a columns:

    Satus_id | Installed_id | Date | All1_Virtual1 | All1_Virtual2 | ... | All10_Virtua30
    

    With 10 values for All and 30 for Virtual you will have 300 columns, not very easy to handle, but they will be worth to have if you have to do something like:

    (All1_Virtual2 - All5_Virtual23) * All6_Virtual12
    

    But in these case I would prefer (if possible) to do the calculation in advance to reduce the number of columns.