Search code examples
sqldatabasedatabase-designrelational-databasedatabase-normalization

How to normalise a table which can link to one or more tables?


I was wondering if someone can help me with a normalisation issue I am facing, when trying to create a seperate communication table, for a personal project I am working on. Lets say I have the following tables and table columns:


Table A:

Table_A_Id, Date, Name, Agent_Id (FK), Notes

Agent Table:

Agent_Id, Date, Name, Phone, Email, Notes

Direct Contact Table:

D_C_Id, Date, Name, Phone, Email, Notes


Above is a sample of my actual tables, but describes my issue.

Essentially to normalise the above table to 1NF, I need to split the Notes column into its on seperate table, to help make the tables atomic, and comply with the 1NF rules. I want to split it out to a seperate column called Communication Table, which can allow me to keep track of when I made a conversation relating to the respective table and what I discussed.

I am not sure how to best to handle the split out Communication Table. because the Notes in Table A also links to an Agent from within the table, however the Notes column in the Agent table, only relates to communications done with the Agent.

So which of the following would you recommend is best for communication table:

EXAMPLE 1:

COMMUNICATION TABLE

Com_Id, Date, Link_Table, Link_Table_Id, Notes

  • Here the Link_Table should define what specific table is being dealt with and Link_Table_Id is the specific Id on the Link_Table

UPDATE:

Sorry I forgot, I need to also add another column called Communicated_With to the above, because there may be instances where communication related to Table A, could occur with another agent or person.

So the table would look like:

Com_Id, Date, Link_Table, Link_Table_Id, Communicated_With, Notes

OR

EXAMPLE 2:

COMMUNICATION TABLE

Com_Id, Date, Table_A_Id, Agent_Id, D_C_Id, Notes

  • Here all the unrelated Id columns would have to be set to null, whilst the related colum defined with an Id.

OR

EXAMPLE 3:

COMMUNICATION TABLE

Com_Id, Date, Notes

  • Here all other tables have a foreign key to this table. (However, this may force me to repeat the rest of the columns just to identify its a different column).

UPDATE 2

Thought of a new solution, after reviewing some youtube videos, I noticed that maybe a join table may work.

EXAMPLE 4

COMMUNICATION TABLE (Same as Example 1, without the "Comminicated_With" Column) Com_Id, Date, Link_Table, Link_Table_Id, Communicated_With, Notes

COMM_AGENT_JOIN TABLE com_Id, Agent_Id

  • I feel this may work, because it would allow me use the Link_Table, and Link_Table_Id to match the table being represented, and then finally the COMM_AGENT_JOIN_Id, can be used to specify which agent was communicated with, when the Link_Table is linked to Table_A.

Sorry if the answer is straight forward, I last did normalisation 2 years ago, and therefore have been struggling to do it.

Thank you.


Solution

  • This is not going to be an answer on how to put the tables into 1NF. I hope it will be helpful, though.

    When creating a database, we usually don't think in 1NF, 2NF etc. We think about what to model and what entities there are. When we think this through, then very often the database is already in 5NF or so. If in doubt we can use the NFs as a kind of checklist.

    I don't know your exact requirements, so there is a lot of guessing or just general advice here. Maybe one of your problems is that you are using the noun "notes" which doesn't describe exactly what this is about. Later you call this "correspondence", but are all "notes" = "correspondence"?

    Your database is about services you take from an agent or from a service company directly. So one entity that I see is this provider:

    provider

    • provider_id
    • name
    • contact_person_name
    • phone
    • email
    • type (agent or final service provider)

    If a provider can have multiple contacts, phones and emails, you'd make this a provider_contact table instead:

    provider

    • provider_id
    • name
    • type (agent or final service provider)

    provider_contact

    • provider_contact_id
    • name
    • phone
    • email
    • provider_id

    As to notes: well if there are notes on a provider ("Always ask for Jane; she's the most professional there.") or contact ("Jane has worked in London and Madrid."), I'd usually make this just one text column where you can enter whatever you like. You can even store HTML or a Word document. No need for multiple documents per provider, I suppose.

    Now there are also services. Either you need a list of who offers which service, then add these tables: service (which services exist) and provider_service (who provides which service). But maybe you can do without that, because you know what services exist and who provides them anyway and don't want to have this in your model.

    I don't know if you want to enter service enquiries or only already fixed service contracts. In any case you may want a service_contract table either with a status or without.

    service_contract

    • service_contract_id
    • provider_id (or maybe provider_contact_id?)
    • service (free text or a service_id referencing a service table)
    • valid_from
    • valid_to

    Here again you may have notes like "We are still waiting for the documents. Jane said, they will come in March 2018.", which again would be just one column.

    Then you said you want the correspondence, which could be an additional table service_contract_correspondance:

    service_contract_correspondance

    • service_contract_correspondance_id
    • service_contract_id
    • type (received from provider or sent to provider)
    • text

    But then again, maybe you can do without it. Will you ever have to access single correspondances (e.g. "give me all correspondences from December 2017" or "Delete all correspondances older than one year")? Or will there be thousands of correspondences on a single contract? Mabe not. Maybe you can again see this as one single document (text, HTML, Word, ...) and add this as a mere field to the table.

    Having a text consisting of multiple contacts like

    January 2, 2018 Jane
    She says they'll need to weeks to make an offer.
    
    January 20, 2018 
    I asked about the status. Talked with some Thomas Smith. He says they'll call us tomorrow.
    
    January 21, 2018 Jane
    She has emailed the contract for us to sign.
    

    is not per se against NF1. As long as you are not interested in details in your database (e.g. you'll never select all January correspondence alone on the contract), then to the database this is atomic; no need to change this.

    Same with contacts. If you have a string column only with "Jane (sales person) 123-45678 jane@them.com, Jack (assistent) 123-98765 jack@them.com", this is not per se against NF1 again. As long as you don't want to select names only or check phone numbers, but always treat the string as a whole, as the contact, then just do so.

    You see, it all boils down to what exactly you want to model and how you want to work with the data. Yes, there are agents and direct providers, but is the difference between the two so big that you need two different tables? Yes there is a chronology of correspondence per contract, but do you really need them separated into single correspondencies in your database?