A different (less redundant) approach to Normalization

Let's say we are to normalize a database into 3rd normal form using the requirement:

I need a movie ticket registry program that can remember customers and the tickets that they've purchased.

We might end up with a database like this:

ticket
    id
    movie_name
    price

customer
    id
    first_name

However, when I look at this, for some reason it looks redundant. What if I were to break it up into even smaller pieces, like this:

name
    id
    name

customer
    id
    fk_name_id

ticket
    id
    fk_name_id
    price

Would this be a good approach? Is there a name for this approach?

Solution

As Jordan says, the point of breaking data out into a separate table is to avoid redundant data.

As you apparently realize, we do NOT want to lay out our tables like this:

WRONG!!!

ticket
  customer_name
  movie_name

That would mean that the customer_name is repeated for every movie he watches, and the movie name is repeated for every person who watches that movie. Lots and lots of redundant names. If the user has to type them in every time, it's likely that sometimes he mis-spells a name or uses a variation on a name, like we find our table includes "Star Wars", "Star Wars IV", "Star Wars Episode IV", and "Stra Wars", all for the same movie. All sorts of problems.

By breaking the customer and the movie out into separate tables, we eliminate all the redundancy. Great. Celebrate.

But if we take your suggestion of making a "name" table that holds both customer names and movie names, did we eliminate any redundancy?

If a customer has the same name as a movie -- if we happen to have a customer named "Anna Karenina" or "John Carter" or whatever (or maybe someone named their kid "Batman Returns" for that matter) -- are you going to use the same record to store both? If no, then you have not saved any redundancy. You have just forced us to do an extra join every time we read the tables.

If you do use the same record, it's even worse. What if you create a record for customer "Anna Karenina" and you share the id/name record with the movie. Then Anna gets married and now her name is "Anna Smith". If you update the name record, you have not only changed the name of the customer, but also the title of the movie! This would be a very bad thing.

You could, of course, say that if you change the name, that instead of updating in place you create a new record for the new name. But then that defeats half the purpose of breaking the names out to a separate table. Suppose when we originally created the movie record we mistyped the name as "Anna Karina". Now someone points out our mistake and we fix it. But with the "make a new record every time there's a change" logic, we'd have to fix each ticket sale one by one.

I guess you could ask the user if this is a change for just the movie title, just the customer name, or both. But now we've added another level of complexity. And for what? Our program is more complex, our queries are more complex, and our user interface is more complex. In exchange, we get a tiny gain in saving disk space for the rare case where a customer coincidentally has the same name as a movie title.

Not worth it.