Here is a question that was thrown to me recently, given a relation schema R = {A,B,C,D,E,G}
and with a set of FDs on R, F = {DG -> AE, G -> C, BG -> AE, D -> CG}
, is any BCNF decomposition that is lossless and dependency-preserving?
I tried different BCNF decompositions on the relation R, but I could not find a satisfiable decomposition.
Hence, I attempted to create a 3NF schema using Bernstein's Synthesis algorithm. I had computed the minimal cover to be F'= {BG -> A, BG -> E, G -> C, D -> G, D -> E, D -> A}
. And from here, I just broke it up into smaller schemas: {BGA}, {BGE}, {GC}, {DG}, {DE}, {DA}
.
But isn't this in BCNF? If it is in BCNF, it looks to be lossless and dependency preserving as well. I have checked for the preservation of dependencies by using the functional dependency closure, F+. Everything seems legitimate. Even after discussing with my classmates, we are all confused as to why it is such.
I was taught that the BCNF decomposition which I am using: find a violating FD in F that holds in R and remove it, would be able to find a valid decomposition that is both lossless and dependency preserving if there is one. I believe I followed the steps clearly, so the 3NF schema I computed should be correct. As for the BCNF decomposition, I followed the algorithm to the book, which is find the violating FD and make it a sub relation, and keep only the determinant of the FD in the leftover relation and repeat. But I could not arrived the schema: {BGA}, {BGE}, {GC}, {DG}, {DE}, {DA}
.
So the question I have is why am I able to find a satisfactory BCNF decomposition using the synthesis algorithm, where the decomposition is just the minimal cover, yet I am unable to do it using the textbook decomposition algorithm? (Assuming all the steps I took are correct)
Edit: this is a link to the BCNF steps
A few facts, assuming that the FDs given form a canonical cover of all the FD of the relation.
The only candidate key of the original relation is {BD}
. So, the decomposition in 3NF that you propose is not lossless, since no relation schema contains both attributes.
Following closely the Bernstein's Synthesis algorithm gives in fact the following decomposition:
R1 (A B E G) with cover of projected dependencies {BG → E, BG → A} and candidate key BG
R2 (A D E G) with cover of projected dependencies {D → G, D → E, D → A} and candidate key D
R3 (C G) with cover of project dependencies {G → C} and candidate key G
R4 (B D) with no non-trivial dependencies
This decomposition preserves both data and dependencies. Moreover, all the schemas satisfy also the BCNF.
So, the question now is: why the Analysis algorithm to find the BCNF produces in this case only decompositions with loss of dependencies? Well, the answer is simply that such algorithm, even if applied by eliminating the FD that violates the BCNF in any order, does not guarantee to find all the possibile decompositions (and does not guarantee either that the solutions found preserve the dependencies). Simply it is a well known (and simple) algorithm that decomposes in BCNF, and for this reason, it is found in many books on databases. For the above (and other) reasons, in practice very often the 3NF is considered a reasonable alternative the the BCNF.