Search code examples
databasepostgresqlrdbmsdatabase-normalizationdenormalization

Understanding Denormalization in Relational Databases?


Let's say I have tables like below in NF3

Table Name: Address

  • line1
  • line2
  • state_id

Table Name: State

  • id
  • name
  • country_id

Table Name: Country

  • id
  • name

Now I want to denormalize this table so that read will be faster and I don't need to JOIN multiple tables, but I couldn't able understand what is denormalization is.

Example 1) Do I need to keep tables like above in 3NF, and extra columns to Address like state_name, country_name so the resulting table looks like below.

Table Name Address

  • line1
  • line2
  • state_name
  • country_name
  • state_id

Or

Example 2) Do I need to remove state_id from the Address table and keep state_name and country_name.

Table Name Address

  • line1
  • line2
  • state_name
  • country_name

From the above 2 examples, which is denormalization?


Solution

  • Quoting Wikipedia:

    Denormalization is the process of trying to improve the read performance of a database, at the expense of losing some write performance, by adding redundant copies of data or by grouping data.

    So, by adding state_name to the Address table, you have denormalized the database.

    It doesn't matter whether you eliminate state_id or not. The simple act of added a column like state_name that has a lot of rows with the same value, makes the database denormalized.