I have emails
table that has sender
and reporter
columns. I want to search given parameter in those columns and return unique values.
Let me explain with sample. This is my table and records:
CREATE TABLE public.emails (
id bigint NOT NULL PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY
(MAXVALUE 9223372036854775807),
sender jsonb NOT NULL,
reporter jsonb not null
);
insert into emails (sender, reporter) VALUES ('[{"email": "[email protected]", "name": "dennis1"}]', '[]');
insert into emails (sender, reporter) VALUES ('[{"email": "[email protected]", "name": "dennis1"}]', '[{"email": "[email protected]", "name": "john"}, {"email": "[email protected]", "name": "dennis1"}, {"email": "[email protected]", "name": "dennis2"}]');
insert into emails (sender, reporter) VALUES ('[{"email": "[email protected]", "name": "dennis1"}]', '[]');
insert into emails (sender, reporter) VALUES ('[{"email": "[email protected]", "name": "dennis1"}]', '[]');
I want to get email addresses and names. I also want to avoid dupes. Only one email and one name. I also don't want to get it as array, instead, one email and name per one row.
john
SELECT
* /* i don't know what to put here pr merge with reporters */
FROM "emails" AS "e"
WHERE (EXISTS (SELECT
*
FROM JSONB_ARRAY_ELEMENTS_TEXT("e"."sender") AS "e" ("email")
WHERE ("e"."email" ~* 'john' or "e"."name" ~* 'john'))
);
john
:email name
[email protected] john
SELECT
* /* i don't know what to put here pr merge with reporters */
FROM "emails" AS "e"
WHERE (EXISTS (SELECT
*
FROM JSONB_ARRAY_ELEMENTS_TEXT("e"."sender") AS "e" ("email")
WHERE ("e"."email" ~* '' or "e"."name" ~* ''))
);
email name
[email protected] john
[email protected] dennis1
[email protected] dennis2
dennis2
is in both sender
and reporter
, therefore, only one of them is required. No dupes.
In fact, in here there is one catch. If sender
or reporter
columns has at least one json object (not json array) then this query also fails.
cannot extract elements from an object
Which is an another story, tho.
How can i achieve my goal in this case?
Normalize your data before searching, then remove duplicates using distinct on ()
clause:
with cte as (select x ->> 'name' as name, x ->> 'email' as email
from emails as e, jsonb_array_elements(e.sender || e.reporter) as x)
select distinct on (email) * from cte where
name ~* '' or email ~* ''
--name ~* 'john' or email ~* 'john'
order by email;
Note that it will always scan the whole table, no indexes applicable in this case. Think about schema normalization.