Search code examples
databasepostgresqljoincartesian-product

How to avoid cross product/Cartesian/full join in SQL?


We have a table A with column "ETA To Destination" in minutes.

id ETA To Dest (mins) Score
1 45
2 75

There is a configuration table where we have defined scores based on the time intervals.

For example - If the ETA is between 31 to 60 mins, score is 5 points, ETA is between 61 to 120 mins, score is 10 points,

From Time To Time Score
0 30 2
31 60 5
61 120 10

We need to find the score (in table A) based on the ETA minutes in the configuration table (between from and to time).

I do not want to write a full join that will do a sequential scan in the table. Is there a better alternative or a better design.

Table A can have million of records and configuration table can have few more additional time interval rows.


Solution

  • SELECT a.id, a.eta, s.score
    FROM   a
    JOIN   score_tbl s ON a.eta BETWEEN s.from_time AND s.to_time
    ORDER  BY a.eta DESC NULLS LAST  -- how to break ties??
    LIMIT  50;
    

    With a matching index on a(eta) to support this query, it's a matter of milliseconds. The LIMIT 50 you mentioned in a comment makes all the difference. Ideally:

    CREATE INDEX a_eta_id_idx on a (eta DESC NULLS LAST) INCLUDE (id);
    

    We can keep it that simple because higher eta correlates to higher score.

    Details depend on undisclosed information. Basically, any B-tree index with eta as leading or only column will do. But you'll have to match yet undisclosed specifications how to break ties.

    DESC NULLS LAST only if eta can be null (isn't defined NOT NULL). Else just DESC. See:

    If there can be duplicates on eta, you'll have to define how to break ties. My query makes an arbitrary pick. But that won't be good enough for proper pagination. See: