Search code examples
sqlpostgresqldistinct-on

SELECT DISTINCT ON in Postgresql returns multiple rows


When I run

SELECT DISTINCT ON (ts) * 
FROM tbl 
WHERE clause=100 
AND ts <= '2018-04-02 15:11:18.819000' 
AND ts > '2018-04-02 15:06:18.819000' 
ORDER BY ts, version ASC;

I expect a single row corresponding to the largest ts that meets the condition, with ties being broken by taking the lowest version.

The query returns

         ts                     version
'2018-04-02 15:07:04.828'   ...   1
'2018-04-02 15:07:05.706'   ...   1

I don't understand why two non-distinct ts are being returned. Is there a similar query that will return the desired result?


Solution

  • Distinct on returns one row for each combination of the keys following the clause. In this case, each ts would appear once. Which one? That is determined by the order by. The keys following the distinct on keys determine the first, so that would be the lowest version number.

    Your query appears to be producing reasonable results.

    If you want one row in the result set with the largest ts, then use order by and limit/fetch first 1 row only:

    SELECT * 
    FROM tbl 
    WHERE clause = 100 AND
          ts <= '2018-04-02 15:11:18.819000' AND
          ts > '2018-04-02 15:06:18.819000' 
    ORDER BY ts DESC, version ASC
    FETCH FIRST 1 ROW ONLY;
    

    Note the ts DESC in the ORDER BY, to get the most recent value of ts.