Search code examples
sqlentity-frameworkentity-framework-coreforeign-keysrdbms

Foreign Key tables to hold data vs 'inlining' that data within the parent table? Performance?


I'm curious if there are any trade offs between creating a child table to hold a set of data compared to just placing all the data in the main table in the first place?

My scenario is that I have data that handles various metrics. Such as LastUpdated, and AmtOfXXX. I'm curious if it would be better to place all this data in a Table (specifically for metrics) and reference it by foreign key, or place all these fields directly in the main table and forego any foreign keys? Are there trade-offs? Performance considerations?

I'm referring to Relational Database Management Systems such as SQL Server and specifically I'll be using Entity Framework Core with MS SQL Server.


Solution

  • Your question appears to be more about the considerations between the two approach rather than asking which is specifically better. The latter is more an opinion. This addresses the former.

    The major advantage to having a separate table that is 1-1 is to isolate the metrics from other information about the entities. There is a name for this type of data model, vertical partitioning (or at least that's what it was called when I first learned about it).

    This has certain benefits:

    • The width of the data rows is smaller. So queries that only need the "real" data (or only the metrics) are faster.
    • The metrics are isolated. So adding new metrics does not require rewriting the "real" data.
    • A query such as select * on the "real" data only returns the real data.
    • Queries that modify only the metrics do not lock the "real" data.

    There might also be an edge case if you have lots of columns and they fit into two tables but not into one.

    Of course, there is overhead:

    • You need a JOIN to connect the two tables. (Although with the same primary key, the join will be quite fast).
    • Queries that modify both the "real" data and the metrics are more complicating, having to lock both tables.