Search code examples
mysqldata-modelingdatabase-normalization

How to normalize table


I guess I should start first by explaining what the system actually does to get a better understanding.

The system is a payslip viewer website wherein the uploader(hr personnel) uploads a bunch of payslips in excel format and those entries get saved into the database. The payslip viewer then fetches the emp_id of the currently logged in user and displays all their earnings, deductions, info AND their total earnings(all earnings are added), total deductions(all deductions are added) and netpay(total earning-total deduction).

My problem is that my professor said I need to "normalize" the table but I was left wondering how could I possibly normalize this as I think this is as "normalized" as it gets. However, I did try to create two erds but I have a couple of questions:

  1. Why would I split earnings and deductions when the point of the system is just to read what the uploader gives?
  2. My professor suggested I get the deductions that are same for all of the employees but there is no such thing, the closest one I could think of is the insurance which is also changing depending on the dependencies and other factors that influence it.
  3. If I did try to create no.3 a problem would be in the event of a change in rates (i.e. Insurance 1 from February = $100 became $200 in March). Since the payslip viewer relies on that table, it would mean if a user would look at his payslip for the month of February, his insurance will also be $200.
  4. I'm creating this database to comply with the requirement of the system which is for the uploader to upload a spreadsheet of payslips, isn't that enough reason to justify my erd? Maybe if this was a payroll system I could set up other tables than can influence the output of the payslip BUT based on the system, the outputs are already done and the hr just needs to upload them to the database.

My first ERD: enter image description here

My second ERD: enter image description here

(I did this to try and normalize it but for the system's requirement, I don't think this is appropriate)


Solution

  • My guess is that your prof wants you to not have 30 deduction fields. Like... when you set up a table and find yourself doing:

    object_typa_1 | object_typa_2 | object_typa_3 | object_typa_4 | object_typeb_1 | object_typeb_2 ..
    

    And storing the values in their respective fields. Instead, you should normalize that in case some day you have to add deduction31 you don't have to do an ALTER TABLE and schluff around all your SQL to accomodate.

    Instead:

    Object Table:
    type | number | value
    

    In your case:

    employees:
    emp_id | password | name | ...
    
    payslips:
    id | emp_id | other payslip attributes | ...
    
    
    payslip_items:
    payslip_id | type  | number | value
    

    In the payslip_items table you stick in multiple records for each payslip. You tell it whether that item was an earning or a deduction in the type field. You say which earning or deduction it was in that number field... probably not the best name, but good for example. And then the value of that earning/deduction in the value field. So in reality your payslip_items table will be 34 records (30 deductions and 4 earnings) for each payslip...

    The reason you want to do it this way is that in the real world, as soon as you launch this thing for the client/business partner, they will want to add a new deduction or earning to the payslip (GUARANTEED). This schema works because you don't have to ALTER TABLE and mess with ALL of your sql that deals with payslip_items. It will grow and shrink as you need.