Search code examples
sqldatabaserelational-databaserdbmsdatabase-theory

Is there ALWAYS a "base table" in any database query?


Ok this is slightly theoretical so it would be great if an unbiased database enthusiast gave an opinion.

For the sake of argument let's agree that there is such a concept as a "base table" w.r.t. to a query, where one table is driving the majority of information of the result set. Imagine a query where there are three relations - TableA, TableB, and TableC

Let's say TableA has cardinality of 1 million records and TableC has 500 records and TableC has 10,000.

Let's say the query is like so -

SELECT A.Col1
     , A.Col2
     , A.Col3
     , A.Col4
     , A.Col5
FROM TableA A
 LEFT JOIN TableB B ON B.ID = A.TableBID
 LEFT JOIN TableC C ON C.ID = A.TableCID

Ok, clearly TableA is the base relation above. It is the biggest table, it is driving the result set by being joined "from", and visually the columns are even on the "left side" of the result set. (The left side thing actually was a criterion to my colleague).

Now, let's assume that TableA has 1 million rows again, TableB is a "junction" or "bridge" table and has like 500,000 rows and TableC has 1,000,000 rows. So assume the query is just an outer join to get all columns in TableA and TableC where a relationship exists like below...

SELECT A.*
     , C.*
FROM TableC C
 FULL OUTER JOIN TableB B ON C.ID = B.TableAID
 FULL OUTER JOIN TableA A ON A.ID = B.TableCID

Ok so given the last query, can anyone tell me what the "base relation" is? I don't think there is one, but was hoping for another database person's opinion.


Solution

  • Let me suggest a perspective where the base table is the first one in the FROM clause (ie not a JOINed table). In the case where a statement can be equally written with either one table or another as base table, we would say that there are two (or more) base tables.

    In your first query, the base table is TableA. If you invert TableA and TableC in the query, you are not guaranteed to get the same results, because of the LEFT JOIN.

    In the second query, as you are using FULL JOINs, all 3 tables could be inverted without changing the result, so this is indeed a use case of a query where all tables are base tables.