Search code examples
relational-databaserelationdatabase-normalization

Decomposing relations to Fourth Normal Form


Disclosure: I am taking Stanford's online database course. The forum there is dead, and I'm hoping for some help on SO.

Here's the quiz question:

Consider relation R(A,B,C,D,E) with multivalued dependencies:

A -» B, B -» D

and no functional dependencies. Suppose we decompose R into 4th Normal Form. Depending on the order in which we deal with 4NF violations, we can get different final decompositions. Which one of the following relation schemas could be in the final 4NF decomposition?

And here is my thinking:

Since we are given that there are no functional dependencies, the only key is set of attributes (A,B,C,D,E). In other words, both multivalued dependencies in the question are violating, and we must decompose them.

I am following the decomposition algorithm given in lecture:

Compute keys for R [done]

Repeat until all relations are in 4NF

Pick any R' with nontrivial A -» B that violates 4NF
Decompose R' into R_1(A, B) and R_2(A, rest)
Compute functional dependencies and multivalued dependencies for R_1 and R_2
Compute keys for R_1 and R_2

I see two ways to decompose the relations: start with A -» B or B -» D.

Starting with A -» B

R(A,B,C,D,E)
      |
      +-----------+
      |           |
 R_1(A,B)  R_2(A,C,D,E)

Since B and D are no longer in the same relation, we do not have a 4NF violation, and we're done. I'm not sure how to compute the FDs, MVDs, and keys at this point.

Starting with B -» D

R(A,B,C,D,E)
      |
      +-----------+
      |           |
 R_1(B,D)  R_2(B,A,C,E)
                  |
                  +----------+
                  |          |
             R_3(A,B)  R_4(A,C,E)

At this point, (A and B) and (B and D) are decomposed into their own relations, so we have no violations, and we're done.

The answer choices:

At this point, I'm completely stumped. I do not see any of the relations in the answer choices, nor can I come up with an idea that will get me there:

  1. CE
  2. AD
  3. AE
  4. ABD

I don't need the answer outright, but what am I missing?


Solution

  • A correct answer is AD.

    How is this obtained?

    Consider that, like for functional dependencies, you can have multivalued dependencies implied by other multivalued dependencies. For instance, there is a pseudo-transitivity rule (or multi-valued transitivity rules) that says:

    If X →→ Y holds, and Y →→ Z holds, then X →→ Z − Y holds

    For this rule, from A →→ B and B →→ D you can derive A →→ D. So, if you decompose the relation in 4NF you could start from this dependency, and get a table with attributes AD. Or, alternatively, in your first decomposition, after finding R_1(A,B) and R_2(A,C,D,E), you should continue to decompose R_2, since it still contain the non-trivial MVD A →→ D, to find R_3(A, D) and R_3(A, C, E).