mysql, database, wordpress, database-design, database-normalization

Shouldn't the WordPress database be normalized?


I just started learning custom WordPress queries and came across the WordPress database. I was astonished to see that the database isn't normalized. Such a structure (i.e. one with redundant data) would be fine when you don't have much data, but what if you have a large amount? Performance would suffer, wouldn't it? My question is:

Why would such a reputable product use a denormalized database? Are there any specific reasons behind it?



Solution

  • The benefit of storing 'publish' as a string rather than a surrogate key reference to a lookup table is to avoid a JOIN to the lookup table.

    The example you give, storing the word 'publish' on multiple rows, is not itself denormalization. You may be confusing normalization with the use of surrogate keys. These are two independent things. Surrogate keys (such as an auto-increment "id" primary key) have nothing to do with normalization.
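To make the distinction concrete, here is a minimal sketch of the two designs. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table and column names (posts_a, posts_b, statuses, status_id) are hypothetical simplifications, not the actual WordPress schema. Both variants are equally normalized; the only difference is that the surrogate-key variant needs a JOIN to filter by status name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Variant A: the status is stored as a string on each row.
    CREATE TABLE posts_a (
        id          INTEGER PRIMARY KEY,
        post_title  TEXT NOT NULL,
        post_status TEXT NOT NULL
    );

    -- Variant B: the status is a surrogate-key reference to a lookup table.
    CREATE TABLE statuses (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE posts_b (
        id         INTEGER PRIMARY KEY,
        post_title TEXT NOT NULL,
        status_id  INTEGER NOT NULL REFERENCES statuses(id)
    );

    INSERT INTO posts_a (post_title, post_status) VALUES
        ('Hello world', 'publish'), ('Second post', 'publish'), ('Notes', 'draft');
    INSERT INTO statuses (id, name) VALUES (1, 'publish'), (2, 'draft');
    INSERT INTO posts_b (post_title, status_id) VALUES
        ('Hello world', 1), ('Second post', 1), ('Notes', 2);
""")

# Variant A: a single-table query, no JOIN required.
titles_a = [row[0] for row in conn.execute(
    "SELECT post_title FROM posts_a WHERE post_status = 'publish' ORDER BY id")]

# Variant B: the same question needs a JOIN to resolve the status name.
titles_b = [row[0] for row in conn.execute(
    "SELECT p.post_title FROM posts_b AS p "
    "JOIN statuses AS s ON s.id = p.status_id "
    "WHERE s.name = 'publish' ORDER BY p.id")]

print(titles_a == titles_b)
```

In both variants each fact is stored once per post, so neither one introduces the kind of redundancy that normalization is meant to remove.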

    The reason to denormalize data is to give certain queries better performance.

    Any such optimization comes with risks, as you note.

    • You may have data anomalies because of the redundant data storage.
    • The size of storage may increase (though a single WordPress instance isn't likely to store a lot).
    • Different queries may have worse performance.

    The best practice for denormalization is to be certain you know which queries need the best performance, and then to write code that handles the extra work of keeping the redundant copies consistent.
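As an illustration of that extra work, the sketch below keeps a redundant comment_count column on a posts table so that reading the count avoids a JOIN and a COUNT. It again uses Python's sqlite3 module as a stand-in for MySQL. The schema is a simplified assumption, and the helpers add_comment and count_mismatches are hypothetical names introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (
        id            INTEGER PRIMARY KEY,
        post_title    TEXT NOT NULL,
        comment_count INTEGER NOT NULL DEFAULT 0  -- redundant: derivable from comments
    );
    CREATE TABLE comments (
        id      INTEGER PRIMARY KEY,
        post_id INTEGER NOT NULL REFERENCES posts(id),
        body    TEXT NOT NULL
    );
    INSERT INTO posts (id, post_title) VALUES (1, 'Hello world');
""")

def add_comment(conn, post_id, body):
    """Insert a comment and bump the redundant counter in one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO comments (post_id, body) VALUES (?, ?)", (post_id, body))
        conn.execute(
            "UPDATE posts SET comment_count = comment_count + 1 WHERE id = ?",
            (post_id,))

def count_mismatches(conn):
    """Return ids of posts whose stored counter disagrees with the real count."""
    rows = conn.execute("""
        SELECT p.id
        FROM posts AS p
        LEFT JOIN comments AS c ON c.post_id = p.id
        GROUP BY p.id, p.comment_count
        HAVING p.comment_count <> COUNT(c.id)
    """)
    return [row[0] for row in rows]

add_comment(conn, 1, "First!")
add_comment(conn, 1, "Nice post.")

# The fast read: no JOIN, no COUNT.
fast_count = conn.execute(
    "SELECT comment_count FROM posts WHERE id = 1").fetchone()[0]
before = count_mismatches(conn)

# A write that bypasses add_comment leaves the counter stale (a data anomaly).
with conn:
    conn.execute("INSERT INTO comments (post_id, body) VALUES (1, 'Untracked')")
after = count_mismatches(conn)

print(fast_count, before, after)
```

The periodic consistency check is one way to detect the anomalies mentioned above; triggers or a single data-access layer that every write goes through are other options with their own trade-offs.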