Search code examples
database-normalizationdecomposition

Proof that a decomposition into 3 relations is lossless


Given this relation scheme with set of attributes R and set of dependencies F:

R = (ABCD) F = {AB -> C, B -> D; C -> A} 

The function dependency B -> D violate BCNF because B is not a superkey so I converted the relation in BCNF by decomposing it in 3 relations using this algorithm:

Given a schema R.

    Compute keys for R.
    Repeat until all relations are in BCNF.
        Pick any R' having a F.D A --> B that violates BCNF.
        Decompose R' into R1(A,B) and R2(A,Rest of attributes).
        Compute F.D's for R1 and R2.
        Compute keys for R1 and R2.

The result I got (which is correct as I checked the available solution) is:

R1:(BD), R2:(CA), R3:(BC).

I know that a property of the conversion algorithm is that the decomposition preserves the data and I want to prove it as en exercise.

Usually with a decomposition into two relations R1 and R2 the procedure is: check for the attributes in common between R1 and R2, do the closure of the result you found, if the closure include all the attributes of either R1 or R2 then the decomposition preserve the data, else is does not.

In the case of this exercise there are no attributes in common between R1,R2 and R3, so I can't do the closure to determine if the decomposition preserve data or not and I don't know how else I could proceed. What should I do the prove that the decomposition is lossless?


Solution

  • To show that the decomposition is lossless, you can proceed in two steps, along the lines of the steps of the decomposing algorithm.

    Starting from your schema, let’s apply the first step of the algorithm.

    (1) R = (ABCD) F = {AB -> C, B -> D; C -> A}
    

    considering that B -> D violates the BCNF (since the candidate keys are AB and BC), we decompose R in:

    (2) R1 = (BD), F1 = {B -> D} and R2 = {ABC}, F2 = {C -> A, AB -> C}
    

    Here we can prove that R1 and R2 are a lossless decomposition, since their intersection is {B}, which is a candidate key for F1 (according to the theorem the you cited).

    Now, since R2 is not in BCNF because of C -> A, according to the algorithm we must decompose R2 in R3 = (CA) and R4 = (CB), so the final decomposition is {R1 = (BD), R3 = (CA), R4 = (CB)}. To show the this decomposition of R is lossless, we can use another theorem that says:

    If ρ = {R1,..., Rm} is a lossless decomposition of R<T,F> (where T are the attributes of R and F are a cover of the dependencies of R), and σ = {S1, S2} a lossless decomposition of R1 with respect to π(T1)(F), then the decomposition {S1, S2, R2, ..., Rm) is lossless with respect to F.

    In the theorem, π(T1)(F) is the projection of the dependencies F onto the attributes T1 of R1.

    In this case, we decompose R2(ABC) and π(T2)(F) = {C -> A, AB -> C}, so the theorem can be applied since R3 and R4 are a lossless decomposition with respect to those dependencies.