Search code examples
mysqldatabaserelational-databasenormalization

How to use PRIMARY KEY in mysql normalized database


The general form of a many-to-many relationship is to create three two-column tables (e.g. to relate favorite movies of members). Two tables are list of movies and members. In these two tables, we assign an auto_increment ID (which is primary key).Then storing relationship of these two IDs in the third table.

I wonder why we use ID for these two columns? We can have two one-column tables as lists of movies and persons; and then create the relationship table relating movies to persons as:

Fred   Gladiator
Brian  Godfather
Fred   Godfather

The only disadvantage of this method is that searching integer-only table is easier to find relationships. Instead we have reduced two indexed columns.

Any idea?


Solution

  • I wonder why we use ID for these two columns?

    I think it's usually cargo cult programming.

    • If you can't distinguish one Fred from another without an arbitrary, meaningless id number, then you can't distinguish one Fred from another with an arbitrary, meaningless id number.
    • If you anticipate changes to either of the names, then you should cascade updates. (Many Oracle developers insist that keys must be immutable. That's not because mutable, natural keys are "bad"; it's because Oracle doesn't support ON UPDATE CASCADE.)
    • There's a widespread--but incorrect--belief that id numbers are simply "faster", possibly because dbms engines are "optimized for joins on id numbers". Whether id numbers are faster in joins depends on the width of your tables, the content of the natural key, the number of rows, the number of joins required, page size, and the nature of your queries. What's often overlooked is that using natural keys usually reduces the number of joins. Sometimes, natural keys eliminate all the joins, even in tables that are in 5NF.

    See this recent SO answer for some measurements.

    Don't guess. Measure.