Let's say we are to normalize a database into 3rd normal form using the requirement:
I need a movie ticket registry program that can remember customers and the tickets that they've purchased.
We might end up with a database like this:
ticket
id
movie_name
price
customer
id
first_name
However, when I look at this, for some reason it looks redundant. What if I were to break it up into even smaller pieces, like this:
name
id
name
customer
id
fk_name_id
ticket
id
fk_name_id
price
Would this be a good approach? Is there a name for this approach?
As Jordan says, the point of breaking data out into a separate table is to avoid redundant data.
As you apparently realize, we do NOT want to lay out our tables like this:
WRONG!!!
ticket
customer_name
movie_name
That would mean that the customer_name is repeated for every movie he watches, and the movie name is repeated for every person who watches that movie. Lots and lots of redundant names. If the user has to type them in every time, it's likely that sometimes he mis-spells a name or uses a variation on a name, like we find our table includes "Star Wars", "Star Wars IV", "Star Wars Episode IV", and "Stra Wars", all for the same movie. All sorts of problems.
By breaking the customer and the movie out into separate tables, we eliminate all the redundancy. Great. Celebrate.
But if we take your suggestion of making a "name" table that holds both customer names and movie names, did we eliminate any redundancy?
If a customer has the same name as a movie -- if we happen to have a customer named "Anna Karenina" or "John Carter" or whatever (or maybe someone named their kid "Batman Returns" for that matter) -- are you going to use the same record to store both? If no, then you have not saved any redundancy. You have just forced us to do an extra join every time we read the tables.
If you do use the same record, it's even worse. What if you create a record for customer "Anna Karenina" and you share the id/name record with the movie. Then Anna gets married and now her name is "Anna Smith". If you update the name record, you have not only changed the name of the customer, but also the title of the movie! This would be a very bad thing.
You could, of course, say that if you change the name, that instead of updating in place you create a new record for the new name. But then that defeats half the purpose of breaking the names out to a separate table. Suppose when we originally created the movie record we mistyped the name as "Anna Karina". Now someone points out our mistake and we fix it. But with the "make a new record every time there's a change" logic, we'd have to fix each ticket sale one by one.
I guess you could ask the user if this is a change for just the movie title, just the customer name, or both. But now we've added another level of complexity. And for what? Our program is more complex, our queries are more complex, and our user interface is more complex. In exchange, we get a tiny gain in saving disk space for the rare case where a customer coincidentally has the same name as a movie title.
Not worth it.