The following query is attempting to exclude one array from the main array, then group results.
SELECT utc_offset, is_dst,
ltrim(
trim(string_agg(distinct (CASE WHEN abbrev NOT LIKE '+%' AND abbrev NOT LIKE '-%' AND abbrev != name THEN abbrev ELSE '' END), ' ')) ||
' ' || string_agg(name, ', ' ORDER BY name)
)
FROM pg_timezone_names
WHERE name NOT LIKE 'posix/%'
AND name NOT LIKE 'Etc/%'
AND (lower(abbrev) <> abbrev)
AND name NOT IN ('HST', 'Factory', 'GMT', 'GMT+0', 'GMT-0', 'GMT0', 'localtime', 'UCT', 'Universal', 'UTC', 'PST8PDT', 'ROK', 'W-SU', 'MST', 'CST6CDT')
EXCEPT
SELECT n.*, a.*
FROM pg_timezone_names n
JOIN pg_timezone_abbrevs a ON a.abbrev = n.name
WHERE n.utc_offset <> a.utc_offset
GROUP BY utc_offset, is_dst
ORDER BY utc_offset, is_dst
;
is instead generating an error:
ERROR: column "pg_timezone_names.utc_offset" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT utc_offset, is_dst,
The query works without the EXCEPT
block. Clearly there is a syntactic problem - the GROUP BY
(and by extension) ORDER BY
verbs are not being seen.
How should this query then be cast?
The stated objective is to ...
exclude one array from the main array, then group results.
I think you really want this:
"Exclude time zones where an abbreviation with identical name but different offset exists - before aggregating qualifying rows."
EXCEPT
requires compatible SELECT
lists left and right.SELECT n.*, a.*
is a repeat offender: it also disagrees with the GROUP BY
clause - the immediate cause for the observed error message.You attempt would work like this:
SELECT utc_offset, is_dst,
ltrim(
trim(string_agg(distinct (CASE WHEN abbrev NOT LIKE '+%' AND abbrev NOT LIKE '-%' AND abbrev != name THEN abbrev ELSE '' END), ' ')) ||
' ' || string_agg(name, ', ' ORDER BY name)
)
FROM (
SELECT utc_offset, is_dst, name, abbrev
FROM pg_timezone_names
WHERE name NOT LIKE 'posix/%'
AND name NOT LIKE 'Etc/%'
AND (lower(abbrev) <> abbrev)
AND name NOT IN ('HST', 'Factory', 'GMT', 'GMT+0', 'GMT-0', 'GMT0', 'localtime', 'UCT', 'Universal', 'UTC', 'PST8PDT', 'ROK', 'W-SU', 'MST', 'CST6CDT')
EXCEPT
SELECT n.utc_offset, n.is_dst, n.name, n.abbrev
FROM pg_timezone_names n
JOIN pg_timezone_abbrevs a ON a.abbrev = n.name
WHERE n.utc_offset <> a.utc_offset
) sub
GROUP BY utc_offset, is_dst
ORDER BY utc_offset, is_dst
;
But don't. Rather use this 100 % equivalent ...
SELECT utc_offset, is_dst
, concat_ws(' ', string_agg(DISTINCT abbrev, ' ' ORDER BY abbrev) FILTER (WHERE NOT abbrev ^@ ANY ('{+,-}') AND abbrev <> name)
, string_agg(name, ', ' ORDER BY name)) AS abbrevs_and_names
FROM pg_timezone_names n
WHERE lower(abbrev) <> abbrev
AND NOT name ^@ ANY ('{posix/, Etc/}')
AND name <> ALL ('{HST, Factory, GMT, GMT+0, GMT-0, GMT0, localtime, UCT, Universal, UTC, PST8PDT, ROK, W-SU, MST, CST6CDT}')
AND NOT EXISTS ( -- !!!
SELECT FROM pg_timezone_abbrevs a
WHERE a.abbrev = n.name
AND a.utc_offset <> n.utc_offset
)
GROUP BY utc_offset, is_dst
ORDER BY utc_offset, is_dst
;
NOT EXISTS
is much easier than EXCEPT
here. (It is most of the time.) See:
I threw in a couple other (optional) simplifications & optimizations:
About the "starts with" operator ^@
(requires at least Postgres 11):
About the ANY
construct:
About null-safe concatenation with concat_ws()
:
About the aggregate FILTER
clause: