Search code examples
sqlentity-relationshipdatabase-normalizationerd

SQL: Is this normalized? Extra field in an entity


I'm working on an ERD for a context where a therapist can make tests for clients to take, and there is a rule that after creating the test, the therapist can add items to it until the test has been taken by a user.

There is a:

TestSession entity

SessionID, 
TestID PK, 
etc etc

and a

Test entity

TestID
Taken t/f

There would a trigger that sets Test.Taken to true if a Session with the matching TestID is added. A group member made a case that Test.Taken is repetitive and not normalized, since the same information can be approximated by searching TestSession for the particular TestID as needed. In this case, normalization being 4NF or BCNF.

Does it more make sense to have a Test.Taken field that's updated by a trigger or to find the same info ad hoc from TestSession when it's needed?


Solution

  • I will try an answer from a practical point of view. Although normalization is important, I would design more starting from how your data is really used. Also, I will provide examples based on T-SQL, but they can easily be converted to any SQL flavor.

    1) not defining Test.Taken. To get entity information along with its usage you should write a query like the following:

    SELECT T.TestID, CAST((WHEN TS.TestID IS NOT NULL THEN 1 ELSE 0 END) AS BIT) AS Taken
    FROM Test T
        LEFT JOIN TestSession TS ON TS.TestID = T.TestID
    

    This can be embodied into a view if it is used in several places.

    To insert data in your structure, you simply make an INSERT into TestSession table.

    2) defining Test.Taken. To get the same information as above, the query becomes simpler and faster:

    SELECT T.TestID, T.Taken
    FROM Test T
    

    However, a session insertion will be slightly complex (trigger is not required, just use a transaction):

    BEGIN TRAN
    SET XACT_ABORT ON
    
    INSERT INTO TestSession (TestID)
    VALUES (@TestID)
    
    UPDATE Test SET Taken = 1 WHERE TestID = @TestID
    COMMIT
    

    If you do not handle a very large number of records (~10-100M or more), first approach can be used as it simplifies data persistence and reduces the risk of inconsistencies. For very larger structures where performance becomes a problem, redundancies may help.