Search code examples
normalizationdatabase-normalizationthird-normal-form

Normalization - 2NF and 3NF


https://dba.stackexchange.com/questions/98427/normilsation-2nf-and-3nf

I've been through several questions and youtube tutorials; I understand that 2NF is removing the partial dependencies and 3NF is the transitive ones, but I can't get my head around how the following example should look like in 2NF.

studentID | studentName | courseCode | courseTitle | modCode | modTitle | credits | resultCode

My attempt is the following for 2NF:

Student

studentID | studentName | courseCode | modCode | resultCode

Course

courseCode | courseTitle

Module

modCode | modTitle | credits

Is this correct? If not, where am I going wrong and why.


The following is the 3NF:

Student

studentID | studentName | courseCode

Course

courseCode | courseTitle

Module

modCode | modTitle | credits | courseCode

Results

studentID | modCode | resultCode

Same goes for this; is this correct - if not where and why?


Solution

  • Ok, let's have a look at your 2NF attempt:

    Student

    studentID | studentName | courseCode | modCode | resultCode

    Course

    courseCode | courseTitle

    Module

    modCode | modTitle | credits

    Let's talk about Student first.

    Your key cannot be studentID, because the resultCode depends on studentID and courseCode (you have one result for every course). But your studentName depends only on studentID, so a part of the key (studentID,courseCode). So 2NF is violated. You need to do something like this:

    Student

    studentID | studentName | courseCode | modCode

    Course

    courseCode | courseTitle

    Module

    modCode | modTitle | credits

    Result

    studentID | courseCode | resultCode

    But please keep in mind, that this respects the 2NF, but still seems not correct, because modules and courses are now completely unrelated. So try this:

    Student

    studentID | studentName | modCode

    Module

    modCode | modTitle | credits

    Course

    courseCode | courseTitle | modCode

    Result

    studentID | courseCode | resultCode

    A course belongs to a module (a module has many courses) - from my point of view. This is automatically in 2NF (because the key is always just one attribute (except for the result, but there is just one non-key-attribute always depending on both key attributes), so nothing can depend on a "part" of the key). And it is in 3NF, too, because every "physical" entity has it's logical representation in the data model (more a rule of thumb than a formalism).


    Now let's have a look at your 3NF attempt. I think you've got the module<->course dependency the wrong way round, but let's just concentrate on the normalization.

    Student

    studentID | studentName | courseCode

    Course

    courseCode | courseTitle

    Module

    modCode | modTitle | credits | courseCode

    Results

    studentID | modCode | resultCode

    This is correct 3NF, because there is simply no other key candidate than the key itself. So there can't be any transitive dependency.

    To clarify this: a key candidate is one of the many minimal sets of possible keys. In your relation you have found at least one key candidate with one element (except for the results relation). So any other key candidate cannot have more than one element. That means, that you can simply look at every single attribute and decide "can this be a key or not?" And in your example you find, that no other attribute can be a key - so it's automatically in 2NF and 3NF.