I'm taking a class in database design. The curriculum focuses heavily on nomalization. The steps and methods speak for themselves, however I find the intuition aspect frustrating. Is there a mathematical way to handle data normalization, where one can assign the properties and crunch a mathematically certain conclusion?
I'll give another real world example in a discussion regarding addresses.
You are architect in charge of redesigning a legacy app with an old database that contains address information. The DB currently has 1 field for all address information.
Our address looks something like this:
Address
___________________________________
Bob Robertson 123 Broad Way Springfield IL 62701
Unacceptable! Let's normalize this a bit to remove some of that redundancy, so in our second pass, we break it up into a few fields:
Address1 Address2 City State Zip
________ ________ ____ _____ ____
Bob Robertson 123 Broad Way Springfield IL 62701
Well, you say, there are multiple addresses within a state, so we need to break that out into a lookup table, and we can save tons of space on that city name, and there is a one to many relationship between streetname and all of the numbers that exist on that street.
We also need to represent the 9 digit zip codes, and one 5 digit zip code has multiple 4 digit suffixes, so that needs it's own table too in order to fully normalize this relationship, boy am I clever!
Also, we might have multiple robertsons, and multiple Bobs that we ship things to, so we need a cross reference lookup table for that as well.
At this point, our table may look like this:
FirstNameID LastNameID StreetID StreetNum CityID StateProvinceID ZipID 4DigitSuffix
__________ _________ ________ ___________ ______ _______________ _____ ___________
3452 1257 45 234 990 32 123 1234
If you presented this to most rational developers, requiring 6 or more joins just to get address information, you might get run out of town on a rail. In many cases, there is an optimal middle ground that satisfies business needs and also speed/ease of development.