I have two tables details and data table. I already joined the two tables and the crosstab function is already done.
I want to show only the latest data per serial
. Refer to the current and desired output below.
Question:
How can I use DISTINCT ON
in this crosstab query?
Table details
:
serial | date | line | total_judgement
---------+----------------------+--------------+----------------
123 | 2016/05/21 12:00:00 | A | 1
456 | 2016/05/21 12:02:00 | A | 0
456 | 2016/05/21 12:05:00 | A | 0
Table data
:
serial | date | readings | value
---------+----------------------+------------+-------------
123 | 2016/05/21 12:00:00 | reading1 | 1.2342
123 | 2016/05/21 12:00:00 | reading2 | 2.3213
123 | 2016/05/21 12:00:00 | reading3 | 3.4232
456 | 2016/05/21 12:00:02 | reading1 | 1.2546
456 | 2016/05/21 12:00:02 | reading2 | 2.3297
456 | 2016/05/21 12:00:02 | reading3 | 3.4264
456 | 2016/05/21 12:00:05 | reading1 | 1.9879
456 | 2016/05/21 12:00:05 | reading2 | 2.4754
456 | 2016/05/21 12:00:05 | reading3 | 3.4312
Current Output:
serial | line | date | total_judgement| reading1 | reading2 | reading3
---------+------+----------------------+----------------+-----------+-------------+--------------
123 | A | 2016/05/21 12:00:00 | 1 | 1.2342 | 2.3213 | 3.4232
456 | A | 2016/05/21 12:00:02 | 0 | 1.2546 | 2.3297 | 3.4264
456 | A | 2016/05/21 12:00:02 | 0 | 1.9879 | 2.4754 | 3.4312
Desired Output:
serial | line | date | total_judgement | reading1 | reading2 | reading3
---------+------+----------------------+-----------------+-----------+-------------+--------------
123 | A | 2016/05/21 12:00:00 | 1 | 1.2342 | 2.3213 | 3.4232
456 | A | 2016/05/21 12:00:05 | 0 | 1.9879 | 2.4754 | 3.4312
Here's my Code:
SELECT * FROM crosstab (
$$ SELECT
tb2.serial,
tb1.line,
tb2.date,
tb1.total_judgement,
tb2.readings,
tb2.value
FROM
data tb2
INNER JOIN details tb1 ON (tb2.serial = tb1.serial
AND tb2.date = tb1.date)
ORDER BY tb2.date ASC $$,
$$ VALUES ('reading1'),('reading2'),('reading3')$$
) as ct("S/N" VARCHAR (50),
"Line" VARCHAR(3),
"Date" TIMESTAMP,
"TotalJudgement" CHARACTER(1),
"Reading1" FLOAT8,
"Reading2" FLOAT8,
"Reading3" FLOAT8);
I need to join the two tables on serial
and date
.
I think DISTINCT ON
may help with this but I don't seem to get correct results when I use DISTINCT ON serial
.
Making some assumptions, this is probably what you are looking for:
SELECT * FROM crosstab (
$$
SELECT t2.serial
, t1.line
, t2.date
, t1.total_judgement
, t2.readings
, t2.value
FROM (SELECT DISTINCT ON (serial) * FROM details ORDER BY serial, date DESC) t1
JOIN data t2 USING (serial, date)
ORDER BY t2.serial
$$
, $$VALUES ('reading1'),('reading2'),('reading3')$$
) AS ct("S/N" text
, "Line" text
, "Date" timestamp
, "TotalJudgement" text
, "Reading1" float8
, "Reading2" float8
, "Reading3" float8);
If you apply DISTINCT ON (serial)
after the join, you would only retain a single row from data
. Move the DISTINCT
step into a subquery on details
to get all readings in data
for the latest row of each serial
in details
.
BTW, DISTINCT
and DISTINCT ON ( expression [, ...] )
are not "functions" but SQL constructs. Basics:
While being at it I made some simplifications to the code. Not essential for the answer.
If there are many rows per serial
in table details
, it might be more efficient to use one of these techniques instead of DISTINCT ON
:
I can't be more specific without knowing table definitions, cardinalities etc.