Second normal form question


I'm unsure about the way I'm thinking when I'm normalising. I'm designing a database for a fictional online pizza shop.

Consider a table with a concatenated (composite) key made up of order_nr and pizza_article_nr.

I'm stuck with the pizza toppings. I'm thinking that, taken literally, they don't rely on the pizza, since technically speaking they can exist on their own. Yet in reality they're always connected to a pizza. So do they exist on their own, so that I will deal with them in 3NF, or does the column 'toppings' fail 2NF because in practical reality it does rely on the pizza?


Solution

  • The source of your confusion is that you are seeing keys in more than one place and thinking that this must be redundancy. In normalization you need to ignore the pseudo-redundancy in the keys. It is not real redundancy but merely repetition of information. The repetition is there for a reason, namely to indicate the relationship between entities.

    If you have a table for toppings that are available, i.e. the primary key is topping_id, then a table that tells you which topping is on which pizza is 3NF. If you don't have a lookup table for toppings and instead put the topping name in your pizza composition table, then I think a lot of people would say you're violating 2NF. They would be right if topping names are not immutable. If the topping names happen to be immutable then there's an argument to say that the topping name is your primary key to an implicit topping table. However, as a matter of best practice, it's good to have meaningless keys in general - unless you can come up with a really good reason to use a meaningful key on a case by case basis. Therefore avoid using topping name in your pizza composition table.

    Since you can often order more than one pizza at a time (I cut code and have two teenage sons, so I speak from experience) your schema should probably be along these lines:

    ORDER:
      order_id (PK)
    , date_taken
    , deliver_to (or FK to a CUSTOMER table if you're being ambitious)
    
    PIZZA:
      pizza_id (PK)
    , order_id (FK)
    , size
    
    TOPPING:
      topping_id (PK)
    , topping_name
    
    PIZZA_COMPOSITION:
      pizza_id (PK, FK)
    , topping_id (PK, FK)
    , quantity (My kids insist on double cheese)
    , coverage (One likes half plain cheese...)
    

    This schema is 3NF because every non-key column depends only on the key of its own table, and the only thing that appears in more than one place is a foreign key.
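
To make the schema above concrete, here is a minimal sketch of it using Python's built-in sqlite3 module. The column types, the NOT NULL / UNIQUE constraints, the 'whole' / 'half' coverage values and the sample rows are all assumptions made up for illustration; the answer only names the tables and columns. It shows the practical payoff of the TOPPING lookup table: renaming a topping touches one row, and every pizza that uses it picks up the new name through the foreign key.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

# Column types and constraints are assumptions; only the names come from the schema.
conn.executescript("""
CREATE TABLE "order" (              -- ORDER is a reserved word, hence the quotes
    order_id    INTEGER PRIMARY KEY,
    date_taken  TEXT NOT NULL,
    deliver_to  TEXT NOT NULL
);
CREATE TABLE pizza (
    pizza_id  INTEGER PRIMARY KEY,
    order_id  INTEGER NOT NULL REFERENCES "order"(order_id),
    size      TEXT NOT NULL
);
CREATE TABLE topping (
    topping_id    INTEGER PRIMARY KEY,
    topping_name  TEXT NOT NULL UNIQUE
);
CREATE TABLE pizza_composition (
    pizza_id    INTEGER NOT NULL REFERENCES pizza(pizza_id),
    topping_id  INTEGER NOT NULL REFERENCES topping(topping_id),
    quantity    INTEGER NOT NULL DEFAULT 1,
    coverage    TEXT NOT NULL DEFAULT 'whole',
    PRIMARY KEY (pizza_id, topping_id)
);
""")

# Made-up sample data: one order with two pizzas.
conn.execute("INSERT INTO \"order\" VALUES (1, '2024-01-01', '12 Example Street')")
conn.executemany("INSERT INTO pizza VALUES (?, ?, ?)",
                 [(1, 1, "large"), (2, 1, "medium")])
conn.executemany("INSERT INTO topping VALUES (?, ?)",
                 [(1, "cheese"), (2, "mushroom")])
conn.executemany("INSERT INTO pizza_composition VALUES (?, ?, ?, ?)",
                 [(1, 1, 2, "whole"),   # double cheese on pizza 1
                  (1, 2, 1, "half"),    # mushroom on half of pizza 1
                  (2, 1, 1, "whole")])  # plain cheese on pizza 2

# Renaming a topping is a single-row update because the name lives in one place.
conn.execute("UPDATE topping SET topping_name = 'mozzarella' WHERE topping_id = 1")

rows = conn.execute("""
    SELECT p.pizza_id, t.topping_name, pc.quantity, pc.coverage
    FROM pizza_composition AS pc
    JOIN pizza   AS p ON p.pizza_id   = pc.pizza_id
    JOIN topping AS t ON t.topping_id = pc.topping_id
    ORDER BY p.pizza_id, t.topping_id
""").fetchall()

for row in rows:
    print(row)
```

Had topping_name been stored directly in pizza_composition, the same rename would have needed an update on every composition row that mentions the topping, which is the kind of update anomaly normalization is meant to prevent.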