Search code examples
sqlusage-statistics

Maintaining statistics over SQL tables


I have a few tables in a SQL database, and i want to have another table (say, Stats) that will hold some stats over those tables (averages, counts, etc.)

I could do updates on X each time one of my tables is modified, like this:

INSERT INTO Things VALUES (x, y, z);

UPDATE Stats
    SET AverageX = (SELECT avg(x) FROM Things WHERE Things.y="identifier")
    WHERE Stats.y="identifier";

But this seems to be sort of inefficient (averaging each time). Moreover, it is possible that Stats won't contain a record that matches Stats.y="identifier", so I might need to check this and possibly create a record first.

So, the question is: how do decent people do such stuff?


Solution

  • You can do this automatically by having triggers (insert, delete, and update) on all the tables where you are keeping data. You probably don't want to store the average in the statistics table. Instead, you want the sum and count.

    However, you should only be doing this if the following conditions are met:

    • You are referencing the statistics much more often than you are modifying the underlying tables.
    • The underlying tables are sufficiently large, so calculating the statistics on the fly is expensive.

    If these conditions are not met and your tables are relatively small, having a view that calculates the statistics on the fly might be sufficient.

    Note that some statistics can be very fast to calculate with indexes -- in particular MIN() and MAX().

    Maintaining triggers is non-trivial. And maintaining consistency across different tables is tricky (consider bulk insert or truncate table which do not execute triggers).