I have a table where I need to add rows as follows:
id | empid | manager | page
------------------------------
1 | emp001 | mg001 | page1
2 | emp001 | mg001 | page2
3 | emp001 | mg002 | page1
I am confused whether to use the format shown above, or if I should use comma-separated values like
id | empid | manager | page
---------------------------------
1 | emp001 | mg001 | page1,page2,page3,page4.....
2 | emp001 | mg002 | page2,page10,page5,.....
If I go with option 1 I the number of rows keeps increasing and empid
and mangerid
can repeat. But if I go with option 2 then I it's not in a normalized form.
Which is better solution and why?
There is nothing wrong with increasing number of rows. Relational databases work best with normalized data and you can perform effective JOIN
operation between tables using indexes provided that you have created them.
Hence the approach presented in option 1 which is actually getting your data to 1NF (first normal form) is way better and will not come to bite you in the future just as the option 2 certainly would.
If in the future you might come up with an idea of analyzing employees and their managers per pages this is where the option 2 bites you.
As an additional note I think you could lower the storage required for your columns and stop repeating yourself by adding additional tables to store employee and manager strings and reference them only by an integer column. As for column pages I find it redundant to append the part "page" making the column values look like "pageX". Column name already tells you that it consist of page values so an integer column X would also suffice in this case.
I find the following schema to be a good start: