Search code examples
joinrelational-databaserelational-algebranatural-join

Why do these sample relations have many valid joins but only one composition?


I am trying to understand some of the claims from a paper that introduced relational databases. Reference: https://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf

The author claims that some example relations have multiple joins but only one composition (see page 385 in the article).

Why is this the case? I wonder because a subset of R * S is a valid join, but not the case for R . S. I don't understand because one can take a valid subset of R . S in the same way as R * S.

R

1 a  
1 b
1 c
2 c
2 d
2 e

S

a  g
b  f
c  f
c  g
d  g
e  f

R * S (The natural join, as the author calls it; a subset of this would be a valid join as well.)

1 a g  
1 b f
1 c f
1 c g
2 c f
2 c g
2 d g
2 e g 

R ⋅ S (The natural composition, but the author claims there is only one composition despite the existence of multiple valid joins.)

1 f
1 g
2 f
2 g

Solution

  • If T is a composition of R with S, then by definition, there exists some join U of R with S such that T = π₁₃(U). (Note: since you're asking about a specific paper, I'm using the terms and definitions from that paper; but please be aware that the paper is more than fifty years old, and its terms and definitions do not match how they're used today.)

    So, let's consider that join U.

    By definition, π₁₂(U) = R and π₂₃(U) = S.

    Therefore, since R contains the pair (1, a) and no other pairs of the form (_, a), U must contain at least one triple of the form (1, a, _) and no other triples of the form (_, a, _). Likewise, since S contains the pair (a, g) and no other pairs of the form (a, _), U must contain at least one triple of the form (_, a, g) and no other triples of the form (_, a, _). So U must contain the triple (1, a, g), and T = π₁₃(U) must contain the pair (1, g).

    And, likewise for every other pair in the natural composition R·S: for each such pair p, there's no way to construct a join U such that π₁₂(U) = R and π₂₃(U) = S but p ∉ π₁₃(U).

    So T = R·S.