Search code examples
database-normalization

Trying to have a normalized database of trading cards


I've been looking over data normalization and don't quite see how to apply it to the database of cards im setting up. Each card is from a set and that set has a symbol. Since those two things will be the same for each card in that set should that make up it's own table? Should all the cards be in one large table regardless of set?

Also, is there a reason not to just have cards from each set in a table named after the symbol and just put the name of the set in a column?

I'm a noob to this so if i need to be more specific or just shut face let me know. Thanks in advance.


Solution

  • It depends on what normal form you want.

    Consider a card table that looks like this, the primary key is (Name, Set):

    Card:
    | Name   | Set   | Symbol |
    ---------------------------
    | Card 1 | Set 1 | 1      |
    | Card 2 | Set 1 | 1      |
    | Card 3 | Set 1 | 1      |
    | Card 1 | Set 2 | 2      |
    | Card 2 | Set 2 | 2      |
    

    This table is in First Normal Form (1NF). However it is not in Second Normal Form (2NF), because the value of Symbol is dependant on the value of Set, and (Set) is a proper subset of the primary key.

    As you said in your question, if you want 2NF, you need to create a seperate Set table:

    Card:
    | Name   | Set   |
    ------------------
    | Card 1 | Set 1 |
    | Card 2 | Set 1 |
    | Card 3 | Set 1 |
    | Card 1 | Set 2 |
    | Card 2 | Set 2 |
    
    Set:
    | Name  | Symbol |
    ------------------
    | Set 1 | 1      |
    | Set 2 | 2      |
    

    These tables are now both in 2NF (and 3NF, BCNF, 4NF, ...).

    As to the second part of your question, the reason you would do this instead of having all the columns in one table are twofold:

    • To make changes easier. If you want to change the attributes of a set, you only have to update one row in the Set table, instead of updating all the cards in that set.
    • To reduce the possibility of data corruption. There is a real dependency between set and symbol, but the first table does not enforce this. This means the database will happily allow you to get into a corrupt state where cards in the same set have different symbols. In the second set of tables, this can't happen.