NOTE: I have never done this before:
What are some steps or documentation to help normalize tables/views in a database? Currently, there are several tables and views in a database that do not use primary/foreign key concept and sort of repeats same information in multiple tables.
I'd like to clean this up and also somewhat setup a process that would keep relationship updated. Example, if a person zipcode changes or record is removed then it automatically updates its relationship with other tables row/s.
NOTE:* My question is to normalize existing database tables. The tables are live so how do I approach normalization? Do I create a brand new database with table structure I want and then move data to that database? Once data moved, I plug in stored procedures and imports?
This question is somewhat broad, so I will only explain the concept.
Views are generally used for reporting/data presentation purposes and therefore I would not try to normalise them. Your case may be different.
You also need to be clear about primary / foreign key concept:
Lack of actual constraints (e.g. PRIMARY KEY
, FOREIGN KEY
) defined on the table does not mean that the tables do not have logical relationships on columns.
Data maintenance can be implemented in Triggers
.
If you really have a situation where a lot of highly de-normalised data exists in tables for no apparent reason and you want to normalise it then this problem can be approached in two ways:
Within "Gradual" re-factoring there are a few ways as well:
2.a. You take 1 old table and replace it with a new table and at the same time change all code that uses the old table to use the new table. For large systems this can be problematic as you simply may not be aware of all places that reference this table. On the other hand, it may be useful for situations where the table structure change is not significant and/or when the number of dependencies is small.
2.b. Another way is to create new table(s) (in the same database) in the shape / form you desire. The current tables should be replaced with View
s that return identical data (to old tables) but sourced from "new" tables. This approach removes / minimises the need to modify all dependencies immediately. The drawback is that the View that replaces the old table can become rather complex, especially if View Instead Of Trigger
s are needed to be implemented.