I have company, location and product details in table R:
company location product
------------------------------
abc hilltop alpha
abc hilltop beta
abc riverside alpha
abc riverside beta
buggy underbridge gamma
buggy underbridge theta
buggy underbridge omega
The relationships are multivalued, and the data needs to be decomposed under normalization because the MVDs are company ->> location and company ->> product (where company is not a candidate key, or company U location < R, and likewise for product). But my colleague disagrees and insists that for a relation to have a multivalued dependency, at least four rows should exist for each company value:
t1(company) = t2(company) = t3(company) = t4(company)
For company abc this is true. But for company buggy, which has only one location and three products, this is untrue.
For the formal definition and similar examples I referenced https://en.wikipedia.org/wiki/Multivalued_dependency and https://en.wikipedia.org/wiki/Fourth_normal_form .
I too started having the same doubt after reading the formal definition.
How does this relation still have this MVD even though it does not satisfy the formal definition?
(I am not asking how to normalize this data into 4NF. I need to break it into two tables: company-location and company-product.)
"There exist" says some values exist, and they don't have to be different. EXISTS followed by some name(s) says that there exist(s) some value(s) referred to by the name(s), for which a condition holds. Multiple names can refer to the same value. (FOR ALL can be expressed in terms of EXISTS.)
The notion of MVD can be applied to both variables and values. In fact the form of the linked definition is that a MVD holds in the variable sense when it holds in the value sense "in any legal relation". To know that a particular value is legal, you need business knowledge. You can then show whether that value satisfies an MVD. But to show whether its variable satisfies the MVD you have to show that the MVD is satisfied "in any legal relation" value that the variable can hold. One valid value can tell you that a MVD doesn't hold in (it and) its variable, but it can't tell you that a MVD does hold in its variable. That requires more business knowledge.
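To illustrate the difference between a value and a variable, here is a minimal Python sketch. The helper company_mvds_hold and the extra row in later are hypothetical, invented only for illustration. For this three-column relation, the two company MVDs hold in a value exactly when, for each company, the (location, product) pairs form the full cross product of that company's locations and products.

```python
from collections import defaultdict
from itertools import product


def company_mvds_hold(rows):
    """True when, per company, the (location, product) pairs form a full cross product."""
    pairs = defaultdict(set)
    for company, location, prod in rows:
        pairs[company].add((location, prod))
    for seen in pairs.values():
        locations = {loc for loc, _ in seen}
        products = {p for _, p in seen}
        if seen != set(product(locations, products)):
            return False
    return True


# The value from the question ("gamma" spelled as in the answer's tables).
current = {
    ("abc", "hilltop", "alpha"),
    ("abc", "hilltop", "beta"),
    ("abc", "riverside", "alpha"),
    ("abc", "riverside", "beta"),
    ("buggy", "underbridge", "gamma"),
    ("buggy", "underbridge", "theta"),
    ("buggy", "underbridge", "omega"),
}

# Hypothetical later value: buggy opens a second location that sells only gamma.
later = current | {("buggy", "hilltop", "gamma")}

print(company_mvds_hold(current))  # True
print(company_mvds_hold(later))    # False
```

Whether a value like later is legal is a business question. If it is, the MVDs do not hold in the variable, even though they hold in the current value.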
You can show that this value violates 4NF by using that definition of MVD. The definition says that a relation variable satisfies a MVD when a certain condition holds "for any valid relation" value:
for all pairs of tuples t1 & t2 in r such that t1[a] = t2[a] there exist tuples t3 & t4 [...]
For what MVD and values for t1 & t2 does your colleague claim there don't exist values for t3 & t4? There is no such combination of MVD and values for t1 & t2. Eg for {company} ↠ {product} and t1 & t2 both (buggy, underbridge, gamma), we can take (buggy, underbridge, gamma) as a value for both t3 & t4, and so on for all other choices for t1 & t2.
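The quoted condition can be checked mechanically. The following is a rough Python sketch, not a library function: mvd_holds, ATTRS and R are names made up here, and the rows are those of the question. It reads the definition literally, so t1, t2, t3 and t4 are free to be the same tuple.

```python
from itertools import product

ATTRS = ("company", "location", "product")

# Rows from the question; "gamma" is spelled as in the answer's tables.
R = {
    ("abc", "hilltop", "alpha"),
    ("abc", "hilltop", "beta"),
    ("abc", "riverside", "alpha"),
    ("abc", "riverside", "beta"),
    ("buggy", "underbridge", "gamma"),
    ("buggy", "underbridge", "theta"),
    ("buggy", "underbridge", "omega"),
}


def mvd_holds(rel, attrs, lhs, rhs):
    """Check lhs ->> rhs on one relation value using the t1..t4 definition."""
    pos = {a: i for i, a in enumerate(attrs)}
    rest = [a for a in attrs if a not in lhs and a not in rhs]

    def part(t, names):
        return tuple(t[pos[a]] for a in names)

    def witness(src_rhs, src_rest):
        # Some tuple agrees on lhs, takes rhs from one tuple and the rest from the other.
        return any(
            part(t, lhs) == part(src_rhs, lhs)
            and part(t, rhs) == part(src_rhs, rhs)
            and part(t, rest) == part(src_rest, rest)
            for t in rel
        )

    for t1, t2 in product(rel, repeat=2):  # t1 and t2 may be the same tuple
        if part(t1, lhs) != part(t2, lhs):
            continue
        if not (witness(t1, t2) and witness(t2, t1)):  # existence of t3 and t4
            return False
    return True


print(mvd_holds(R, ATTRS, ["company"], ["product"]))   # True
print(mvd_holds(R, ATTRS, ["company"], ["location"]))  # True
```

Nothing in the check counts rows per company. The three buggy rows pass because each required t3 and t4 is already one of those rows.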
Another definition for F ↠ T holding is that binary JD (join dependency) *{F U T, F U (A - T)} holds, ie that the relation is equal to the join of its projections on F U T & F U (A - T). This definition might be more immediately helpful to you & your colleague in that it avoids the terminology that you & they are misinterpreting. Eg your example data is the join of these two of its projections:
company location
--------------------
abc hilltop
abc riverside
buggy underbridge

company product
----------------
abc alpha
abc beta
buggy gamma
buggy theta
buggy omega
So it satisfies the JD *{{company, location}, {company, product}}, so it satisfies the MVDs {company} ↠ {location} and {company} ↠ {product} (among others). (Maybe you will be able to think of examples of relations with zero, one, two, three etc tuples for which one or more (trivial and/or non-trivial) MVDs hold.)
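The join claim can be verified directly. This is a small Python sketch under the assumption that the relation is modelled as a set of tuples; project, ATTRS and R are illustrative names, not part of any library.

```python
ATTRS = ("company", "location", "product")

# Rows from the question; "gamma" is spelled as in the answer's tables.
R = {
    ("abc", "hilltop", "alpha"),
    ("abc", "hilltop", "beta"),
    ("abc", "riverside", "alpha"),
    ("abc", "riverside", "beta"),
    ("buggy", "underbridge", "gamma"),
    ("buggy", "underbridge", "theta"),
    ("buggy", "underbridge", "omega"),
}


def project(rel, attrs, names):
    """Projection of rel onto the named attributes (duplicates collapse in the set)."""
    cols = [attrs.index(a) for a in names]
    return {tuple(t[c] for c in cols) for t in rel}


company_location = project(R, ATTRS, ("company", "location"))
company_product = project(R, ATTRS, ("company", "product"))

# Natural join of the two projections on the shared attribute company.
joined = {
    (c1, loc, prod)
    for (c1, loc) in company_location
    for (c2, prod) in company_product
    if c1 == c2
}

print(joined == R)  # True
```

The join produces 2 x 2 rows for abc and 1 x 3 rows for buggy, which are exactly the seven original rows.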
Of course, the two definitions are two different ways of describing the same condition.
PS 1 Whenever a FD F → T holds, the MVD F ↠ T holds. For a relation in BCNF, the MVDs that violate 4NF & 5NF are those not so associated with FDs.
PS 2 A relation variable is meant to hold a tuple if and only if it makes a true statement in business terms when its values are substituted into a given statement template, or predicate. That plus the JD definition for MVD gives conditions for a relation variable satisfying a MVD in business terms. Here our predicate is of the form ...company...location...product... (Eg company named company is located at location and makes product product.) It happens that this MVD holds for a variable when for all valid business situations, FOR ALL company, location, product,
EXISTS product [...company...location...product...]
AND EXISTS location [...company...location...product...]
IMPLIES ...company...location...product...