I have a table like so:
|uniqueID|scandatetime |scanfacilityname|
+--------+-------------------+----------------+
|12345678|01-01-2020 13:45:12|BALTIMORE |
|12345678|01-02-2020 22:45:12|BALTIMORE |
|12345678|01-04-2020 10:15:12|PHILADELPHIA |
|12345678|01-05-2020 08:45:12| |
And I would like to return an entire row containing the uniqueID, scandatetime, and latest scanfacilityname (i.e., max scandatetime where scanfacilityname is not null). I have tried the following query:
SELECT
"uniqueID"
, "max"(CAST("scandatetime" AS timestamp)) "timestamp"
, COALESCE("scanfacilityname") "scanfacilityname"
FROM
iv_scans_new.scan_data
WHERE (("partition_0" = '2020') AND ("partition_1" IN ('06', '07', '08'))) and scanfacilityname is not null
group by 1, 3
;
But im not sure if this is correct/if I need the coalesce.
You can use the max_by
function:
select max_by(uniqueID, scanfacilityname), max_by(scandatetime, scanfacilityname), max(scanfacilityname)
See the doc.
There's no coalesce
needed since the max
and max_by
functions will effectively ignore null
values.