Background
I have a Postgres 11 database running on RDS. I need to produce a report which shows counts of events over a certain period, grouping, renaming, and counting them based on description strings into categories according to business requirements.
Problem For this sort of thing, I typically go for a searched case expression, which is working well except: Part of the spec requires some events must be counted into more than one category. But when the case expression is evaluated, it counts [or appears to count] each row into only one category [the first matching one in the expression].
Example:
SELECT
CASE
WHEN (some_table.description ILIKE '%foo%' OR some_table.description ILIKE ‘%foobar%') THEN 'foo'
WHEN (some_table.description ILIKE '%bar%' OR some_table.description ILIKE ‘%foobar%) THEN 'bar'
END AS ‘category’,
count(*)
FROM some_table
GROUP BY category
ORDER BY category DESC
;
So, given data like
event_id | description
---------------------------------
1 | ‘string including foo’
2 | ‘foo also included in this string’
3 | ‘ this string includes bar’
4 | ‘this one says foobar’
output expected is like so:
some_table
category | count
---------------------------------
foo | 3
bar | 2
However, the actual output is being returned as
some_table
category | count
---------------------------------
foo | 3
bar | 1
How can I get records with description containing ‘foobar’ to be counted into both ‘foo’ and ‘bar’ categories? And if a case expression is the wrong method for this reporting goal, what should be used? All guidance appreciated!
The output stops at the first match. If you want both, use a lateral join:
SELECT v.category, COUNT(*)
FROM some_table t CROSS JOIN LATERAL
(VALUES (CASE WHEN s.description ILIKE '%foo%' OR s.description ILIKE '%foobar%' THEN 'foo' END),
(CASE WHEN s.description ILIKE '%bar%' OR s.description ILIKE '%foobar% THEN 'bar' END)
) v(category)
WHERE v.category IS NOT NULL
GROUP BY v.category
ORDER BY v.category DESC
;