mysql, database, database-design, query-optimization, database-optimization

Spreading/distributing an entity into multiple tables instead of a single one


Why would anyone distribute an entity (for example user) into multiple tables by doing something like:

user(user_id, username)
user_tel(user_id, tel_no)
user_addr(user_id, addr)
user_details(user_id, details)

Is there any speed-up you get from this DB design? It seems highly counter-intuitive, because chaining joins to retrieve the data sounds far worse than selecting just the needed columns from a single table (projection).

Of course, if one performs other queries by making use only of the user_id and username, that's a speed-up, but is it worth it? So, where is the real advantage and what could be a compatible working scenario that's fit for such a DB design strategy?
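For concreteness, the two access patterns being compared can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the single-table name user_flat and the sample row are hypothetical, and it says nothing about actual performance on any real engine.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Split 1-to-1 layout from the question: one table per non-key column.
    CREATE TABLE user         (user_id INTEGER PRIMARY KEY, username TEXT NOT NULL);
    CREATE TABLE user_tel     (user_id INTEGER PRIMARY KEY REFERENCES user(user_id), tel_no TEXT);
    CREATE TABLE user_addr    (user_id INTEGER PRIMARY KEY REFERENCES user(user_id), addr TEXT);
    CREATE TABLE user_details (user_id INTEGER PRIMARY KEY REFERENCES user(user_id), details TEXT);

    -- Hypothetical single-table equivalent (the name user_flat is an assumption).
    CREATE TABLE user_flat (
        user_id INTEGER PRIMARY KEY,
        username TEXT NOT NULL,
        tel_no TEXT,
        addr TEXT,
        details TEXT
    );

    -- Made-up sample data, identical in both layouts.
    INSERT INTO user         VALUES (1, 'alice');
    INSERT INTO user_tel     VALUES (1, '555-0100');
    INSERT INTO user_addr    VALUES (1, '1 Main St');
    INSERT INTO user_details VALUES (1, 'likes SQL');
    INSERT INTO user_flat    VALUES (1, 'alice', '555-0100', '1 Main St', 'likes SQL');
""")

# Split layout: three chained joins to rebuild the full entity.
split_row = conn.execute("""
    SELECT u.user_id, u.username, t.tel_no, a.addr, d.details
    FROM user AS u
    JOIN user_tel     AS t ON t.user_id = u.user_id
    JOIN user_addr    AS a ON a.user_id = u.user_id
    JOIN user_details AS d ON d.user_id = u.user_id
    WHERE u.user_id = ?
""", (1,)).fetchone()

# Single-table layout: a plain projection, no joins.
flat_row = conn.execute(
    "SELECT user_id, username, tel_no, addr, details FROM user_flat WHERE user_id = ?",
    (1,),
).fetchone()

# The narrow query that only needs user_id and username touches one small table.
narrow_row = conn.execute(
    "SELECT user_id, username FROM user WHERE user_id = ?", (1,)
).fetchone()
```

Both full queries return the same row; the difference is only in how much work the engine does to assemble it, which is what the question is asking about.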

LATER EDIT: for the purposes of this post, please assume a complete, unique entity whose attributes do not vary in quantity (e.g. a car has only one color, not two; a user has only one username/social security number/matriculation number/home address/email/etc.). That is, we're not dealing with a one-to-many relation, but with a 1-to-1, completely consistent description of an entity. The example above is exactly the case where a single table has been "split" into as many tables as it had non-primary-key columns.


Solution

  • By splitting the user in this way you have exactly 1 row in user per user, which links to 0 to n rows in each of user_tel, user_details and user_addr.

    This in turn means that these can be considered optional, and/or each user may have more than one telephone number linked to them. All in all it's a more adaptable solution than hardcoding it so that users always have up to 1 address, up to 1 telephone number.

    The alternative method is to have e.g. user.telephone1, user.telephone2 etc. However, this goes against normalization: repeating columns like these are usually cited as a 1NF violation, and so the table also fails 3NF ( http://en.wikipedia.org/wiki/Third_normal_form ). Essentially you are introducing a lot of columns to store the same kind of information.

    edit

    Based on the additional edit from OP, assuming that each user will have precisely 0 or 1 of each tel, address, details, and NEVER any more, then storing those pieces of information in separate tables is overkill. It would be more sensible to store within a single user table with columns user_id, username, tel_no, addr, details.

    If memory serves, this is perfectly fine within 3NF. You stated this is not about normal form; however, if each piece of data depends directly on that specific user, then it is fine to have it within the table.

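    If you later expanded the table to have telephone1, telephone2 (for example) then that would violate 1NF. If you have duplicated values (e.g. multiple users share an address, which is entirely plausible), then the same address is stored once per user. Strictly speaking that is redundancy rather than a 2NF violation (a table with a single-column key cannot have partial dependencies), but it is a common reason for moving addresses into their own table.

    This kind of duplication may well be why someone has done this.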
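The 0-to-n case described at the top of this answer can be sketched as follows. It is a minimal illustration, assuming Python's built-in sqlite3 module as a stand-in for MySQL and made-up sample users; the point is only that a child table without a uniqueness constraint on user_id lets a user have zero, one, or several numbers.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, username TEXT NOT NULL);

    -- user_id is NOT unique here, so each user may have 0..n numbers.
    CREATE TABLE user_tel (
        user_id INTEGER NOT NULL REFERENCES user(user_id),
        tel_no  TEXT NOT NULL
    );

    -- Made-up sample data: alice has two numbers, bob has none.
    INSERT INTO user     VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO user_tel VALUES (1, '555-0100'), (1, '555-0101');
""")

# LEFT JOIN keeps users that have no telephone row at all.
rows = conn.execute("""
    SELECT u.username, t.tel_no
    FROM user AS u
    LEFT JOIN user_tel AS t ON t.user_id = u.user_id
    ORDER BY u.user_id, t.tel_no
""").fetchall()
```

Here alice comes back once per number and bob comes back once with a NULL tel_no, without needing telephone1/telephone2 columns. If the 1-to-1 rule from the question's edit really holds, a single table with nullable columns covers the same ground with no join.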