Search code examples
sqlprecompute

guidance on precomputed SQL attributes


Often I deal with aggregate or parent entities which have attributes derived from their constituent or children members. For example:

  • The byte_count and packet_count of a TcpConnection object is computed from the same attributes of its two constituent TcpStream objects, which in turn are computed from their constituent TcpPacket objects.

  • An Invoices object might have a total which is basically the SUM() of its constituent InvoiceLineItems' prices, with a little freight, discount and tax logic thrown in.

When dealing with millions of packets or millions of invoiced line items (I wish!), on-demand computation of these derived attributes -- either in a VIEW or more commonly in presentation logic like reports or web interfaces -- is often unacceptably slow.

How do you decide, before performance concerns force your hand, whether to "promote" derived attributes to precomputed fields?


Solution

  • I personally wouldn't denormalize until performance trade-offs force my hand (because the downside of denormalizations are too drastic IMHO), but you might also consider:

    1. Convenience: e.g. if two different client apps want to calculate the same derived attributes, they both have to code up the queries to calculate them. Denormalization offers both client apps the derived attribute in a simpler way.
    2. Stability over time: e.g. if the formula for calculating a derived attribute is changeable, denormalization allows you to capture and store the derived value at a point in time so future calculations will never get it wrong
    3. Simpler queries: adding complexity to the DB structure can mean your Select query is simpler at the client end.
    4. Performance: Select queries on denormalized data can be quicker.

    Ref: The Database Programmer: The Argument for Denormalization. Be sure to read as well his article on Keeping Denormalized Values Correct - his recommendation is to use triggers. That brings home the kind of trade-off denormalization requires.