Search code examples
relational-databaserelational-algebra

Find all pizzerias that serve ONLY pizzas eaten by people over 30


I know the correct answer but I'm not quite sure about ONE part of it.

RA is a relational algebra interpreter that translates relational algebra queries into SQL queries, then executes the SQL on a standard relational database system. So the language is similar to SQL but based on relational algebra.

Here is the sample database:

Serves(pizzeria, pizza, price)
Eats (name, pizza) 
Person (name, age, gender) 

Find all pizzerias that serve ONLY pizzas eaten by people over 30

    \project_{pizzeria} Serves
\diff
    \project_{pizzeria} (
            Serves
        \join (
                (\project_{pizza} Serves)
            \diff (\project_{pizza} ((\select_{age>'30'} Person) \join Eats))
        )
    )

What I fail to understand is the last part of the query:

\project_{pizza} ((\select_{age>'30'} Person) \join Eats)

Wouldn't you want to diff age < 30 not age > 30? This would minus all the ages that are less than 30 and leave you with all the ages > 30 no? Yet, I know this is wrong. Can someone explain the logic behind this?


Solution

  • Don't jump to conclusions. Read carefully.

    project {pizza} (select {age>'30'} (Person) join Eats)
    

    That expression involves people over 30. But it is not the final answer.

    You subtract those tuples from something else. If you subtracted people over 30 from all people then you would have people not over 30. This is not people over 30 and you are not subtracting it from all people. It is pizzas eaten by people over 30 and you are subtracting it from all pizzas to get pizzas not eaten by people over 30.

        project {pizza} (Serves)
    -   project {pizza} (select {age>'30'} (Person) join Eats)
    

    Later you get those pizzas' pizzerias, ie the pizzerias that serve pizzas not eaten by people over 30.

    project {pizzeria} (
            Serve
        join (
                project {pizza} (Serves)
            -   project {pizza} (select {age>'30'} (Person) join Eats)
            )
    

    Then you subtract them from all pizzerias to get pizzerias that only serve pizzas eaten by people over 30.

        project {pizzeria} (Serves)
    -   project {pizzeria} (
               Serve
           join (
                   project {pizza} (Serves)
               -   project {pizza} (select {age>'30'} (Person) join Eats)
               )
           )
    

    So you never diff/minus/subtract people over 30. In particular, you never subtract them from all people to get people not over 30.