Search code examples
databasedatabase-normalization

4NF, Multivalued Dependencies without Functional Dependencies


We have a relation R(A,B,C,D,E) with multivalued dependencies:

A->>B
B->>D

Relation R doesn't have any functional dependencies.

Suppose we decompose R into 4NF.

Since we don't have any functional dependencies, the only key is all attributes (A,B,C,D,E). There are two ways we can decompose R:

R1(A,B) R2(A,C,D,E) 
R3(B,D) R4(A,B,C,E)

These 2 decompositions are final since there are no nontrivial multivalued dependencies left,

What are problems with my reasoning?


Solution

  • Relation R doesn't have any functional dependencies.

    You mean, non-trivial FDs (functional dependencies). (There must always be trivial FDs.)

    Assuming that the MVDs (multivalued dependencies) holding in R are those in the transitive closure of {A ↠ B, B ↠ D}:

    In 1 R1(A,B) R2(A,C,D,E), we can reconstruct R as R1 JOIN R2 and both R1 & R2 are in 4NF and their join will satisfy A ↠ B. If some component contained all the attributes of the other MVD then we could further decompose it per that MVD. And we would know that, given some alleged values for all components, their alleged reconstruction of R by joining would satisfy both MVDs. But here there is no such component. So we can't further decompose. And we know that an alleged reconstruction of R by joining satisfies A ↠ B but we would still have to check whether B ↠ D. We say that the MVD B ↠ D is "not preserved" and the decomposition to R1 & R2 "does not preserve MVDs".

    In 2 R3(B,D) R4(A,B,C,E), we can reconstruct R as R3 JOIN R4 and both R3 & R4 are in 4NF and the join will satisfy B ↠ D. Now some component contains all the attributes of the other MVD so we can further decompose it per that MVD. And we know that, given some alleged values for all components, their alleged reconstruction of R by joining satisfies both MVDs. That component is R4, which we can further decompose, reconstructing as AB JOIN ACE. And we know that an alleged reconstruction of R by joining satisfies both A ↠ B & B ↠ D. Because the MVDs in the original appear in a component, we say these decompositions "preserve MVDs".

    PS 1 The 4NF decomposition must be to three components

    MVDs always come in pairs. Suppose MVD X ↠ Y holds in a relation with attributes S, normalized to components XY & X(S-Y). Notice that S-XY is the set of non-X non-Y attributes, and X(S-Y) = X(S-XY). Then there is also an MVD X ↠ S-XY, normalized to components X(S-XY) & X(S-(S-XY)), ie X(S-XY) & XY, ie X(S-Y) & XY. Why? Notice that both MVDs give the same component pair. Ie both MVDs describe the same condition, that S = XY JOIN X(S-XY). So when an MVD holds, that partner holds too. We can write the condition expressed by each of the MVDs using the special explicit & symmetrical notation X ↠ Y | S-XY.

    We say a JD (join dependency) of some components of S holds if and only if they join to S. So if S = XY JOIN X(S-Y) = XY JOIN X(S-XY) then the JD *{XY, X(S-XY)} holds. Ie the condition that both MVDs describe is that JD. So a certain MVD and a certain binary JD correspond. That's one way of seeing why normalizing an MVD away involves a 2-way join and why MVDs come in pairs. The JDs that cause a 4NF relation to not be in 5NF are those that do not correspond to MVDs.

    Your example involves two MVDs that aren't partners & neither otherwise holds as a consequence of the other, so you know that the final form of a lossless decomposition will involve two joins, one for each MVD pair.

    PS 2 Ambiguity of "Suppose we have a relation with these multi-valued dependencies"

    When decomposing per FDs (functional dependencies) we are usually given a canonical/minimal cover for the relation, ie a set in a certain form whose transitive closure under Armstrong's axioms (set of FDs that must consequently hold) holds all the FDs in the relation. This is frequently forgotten when we are told that some FDs hold. We must either be given a canonical/minimal cover for the relation or be given an arbitrary set and be told that the FDs that hold in the relation are the ones in its transitive closure. If we're just given a set of FDs that hold, we know that the ones in its transitive closure hold, but there might be others. So in general we can't normalize.

    Here you give some MVDs that hold. But they aren't the only ones, because each has a partner. Moreover others might (and here do) consequently hold. (Eg X ↠ Y and Y ↠ Z implies X ↠ Z-Y holds.) But you don't say that they form a canonical or minimal cover. One way to get a canonical form for MVDs (a unique one per each transitive closure, hopefully more concise!) would be a minimal cover (one that can't lose any MVDs and still have the same transitive closure) augmented by the partner of each MVD. (Whereas for FDs the standard canonical form is minimal.) You also don't say "the MVDs that hold are those in the transitive closure of these". You just say that those MVDs hold. So maybe some not in the transitive closure do too. So your example can't be solved. We can guess that you probably mean that this is a minimal cover. (It's not canonical.) Or that the MVDs that hold in the relation are those in the transitive closure of the given ones. (Which in this case are then a minimal cover.)