Search code examples
mysqldatabasedatabase-normalization

What is the purpose of TYPE table like "User_type table"?


I'm not sure of the header question. Just to elaborate.

I have Table Users and have columns:

  1. Name (example values: "Meg", "Sam", etc.)
  2. Type (example values: "Admin", "Operator", "Maintainer")
  3. Image (example values: "12344.jpg", etc.)
  4. Etc.


I once told by someone that I must create a Table User_type with a seeded values.

  1. Admin [primary id: 1]
  2. Operator [primary id: 2]
  3. Maintainer [primary id: 3]

And change the column Type to Type_id and reference it to User_type Table. And instead of inserting string values ["Admin", "Operator", etc]. I now insert integer 1 for Admin, 2 for Operator, 3 for maintainer.


I don't remember his explanation of this. Now I'm confused.

  • Is this really essential in database matter?
  • Can somebody explain or elaborate me what is the logic of this?
  • String values vs Integer values, whats the matter?

Hope somebody can enlighten me, I am a beginner in database design.


Solution

  • You're right to question this requirement, since apparently your colleague didn't explain clearly why it was so important.

    The fact is there is no rule in database normalization that requires that we use a pseudokey, that is an integer like your example of Type_id.

    It's a good idea to create a lookup table of User_type so you have a concise list of the user types allowed by your system. If you add a new type, you only need to INSERT a row to that lookup table, and then the Users table that references the lookup table will be allowed to use the new type.

    But there's no reason dictated by rules of normalization that the reference has to be to an integer. You can make the primary key of that lookup table a string.

    CREATE TABLE User_type (
      Type VARCHAR(10) NOT NULL PRIMARY KEY
    );
    
    CREATE TABLE Users (
     ...
     Type VARCHAR(10) NOT NULL,
     FOREIGN KEY (Type) REFERENCES User_type(Type)
    );
    

    So why did your "someone" tell you that you have to use an integer Type_id?

    Well, there could be reasons, but they're not due to rules of normalization.

    • INTEGER may be smaller than the string stored in VARCHAR(10), so the data will take up less space. In MySQL, an INTEGER is four bytes. A VARCHAR is variable-length, depending on the string you store. For example "Admin" takes five bytes, plus one byte to encode the length of the string. So it's at least 50% larger than an INTEGER, and longer strings take even more space. But this is probably not worth optimizing, unless you store hundreds of millions of users.
    • Comparing integer to integer is slightly faster than comparing strings (especially if the string has a character set other than binary), but this is premature optimization. Are you going to run these comparisons so frequently that you need to shave off a few nanoseconds? Maybe, but you haven't said so.