Search code examples
phpmysqlrdbmsrelational

What's Better? Multiple Tables having same Entities vs Few Relation Tables having more Records


I'm creating a database on mysql for a small app. Problem is there are too many fields that are identical on different Tables like

Table 1: Muncipal Issues: ID, UserID, Title, Location, Description, ImageURL,

Table 2: Harrasement Issues: ID , UserID, Title, Location, Description, ImageURL

Tables 3 same as above

both tables have almost same coulmns. i want to ask if it's better to use a relations and create a table for handling IDs and link it with other details or it's better to create a single table with an extra coulmn for these issues.

on one hand there'll be too many tables with identical columns. on the other hand there'll few tables with too many rows in it.

What will be best for performance more rows or more tables.

i'm using Mysql.


Solution

  • Firstly, unless you expect millions of records don't care that much about performance but care more about the structure of your data and how easy it will be to access it. Literally write down a list of data that you plan to extract in your app e.g. "find all issues today", "find all unresolved issues older than 6 months" and then try to build real SQL queries on your expected structure. If they're going hard try to change the structure.

    To answer your question: it depends. The current structure has following benefits:

    • It's easy to query certain type of issues
    • It's easy to build a PHP application - just make one template form (or model) and then copypaste it with slight changes for other tables
    • In case of performance problems it may be easier to create a cluster by simply putting each table on the different db server.

    and following downsides:

    • It's inflexible. Adding new field that you forgot to add in the beginning will be painful since you'll have to change 3 (or more) tables and then the same amount of pieces in your app.
    • Adding new types of issues will be painful and require creating new table.
    • Creating SQL-s for getting data like "all non-resolved issues (regardless of type)" will require complicated UNION-s. Moreover this UNIONS will require creating virtual field with issue type otherwise you can't tell from which table did certain id come.

    The classical db approach recommends using one table for common fields and create derived tables for fields that are different. So:

    • issues table should have all common fields and is identified by PK issue_id
    • municipal_issues uses the foreign key to issues.issue_id and has only the specific fields
    • harassment_issues uses the foreign key to issues.issue_id and has only the specific fields
    • also the issues table has the issue_type field that takes values "harassment", "municipal" etc and helps finding the table where the additional data are stored.

    This pattern is called "Class Table inheritance" and you may check out the SQL antipatterns presentation for more info and other approaches. This solves the flexibility issue and still allows re-creating each of the original tables with only one simple JOIN that goes pretty fast.

    Also as a side note you may look into the db schema of bug-trackers like Mantis since this looks like the same domain.