I have got table.
ID section_name purchaseNumber docPublishDate parsing_status
88940 notifications 0373200124515000035 2015-02-27 null
88939 notifications 0373200124515000035 2015-02-29 null
88938 notifications 0373200124515000034 2015-02-27 null
88687 notifications 0373100064615000120 2015-02-28 null
88937 notifications 0373100064615000120 2015-02-27 null
89122 notifications 0373100064615000120 2015-02-27 null
88936 notifications 0373200124515000032 2015-02-27 null
88988 notifications 0373100064615000120 2015-03-02 null
88696 notifications 0373100066915000038 2015-02-27 null
88963 notifications 0373200174515000013 2015-02-27 null
It have some duplicates in purchaseNumber
. I need select from this table only newest records for processing.
In other words I need to:
purchaseNumber
docPublishDate
and select newest.parsing_status
true
(suppose it was newest from step 2) any elements SHOULD NOT be returned.After processing parsing_status
flag is changing to true
.
docPublishDate
- TimeStamp
The items with old dates should not be selected if at last one newest was processed. I tried to do some grouping. But code simply return me all data:
SELECT
"id", "section_name", "purchaseNumber", "docPublishDate", "parsing_status"
FROM "xml_files" WHERE parsing_status IS NULL GROUP BY "purchaseNumber", "id", "section_name", "docPublishDate", "parsing_status" ORDER BY "docPublishDate" DESC
I understand that you want the latest row per purchase, excepted those whose parsing status is true. One option uses distinct on
in a subquery to get the latest element, and then filters in an outer query:
select *
from (
select distinct on (purchasenumber) x.*
from xml_files x
order by purchasenumber, docpublishdate desc
) x
where parsing_status is distinct from true