I have a Postgres db containing a table with a jsonb
column tsent
that contains the results of targeted sentiment analysis of text in another column. The data is structured like so:
CREATE TABLE tbl(id int PRIMARY KEY, tsent jsonb);
INSERT INTO tbl VALUES
(1, '[
{
"class": 0,
"entity": "John Smith",
"sent_no": 0,
"class_label": "negative",
"entity_type": "PER",
"sentiment_prob": 0.95
},
{
"class": 1,
"entity": "University of Illinois",
"sent_no": 0,
"class_label": "neutral",
"entity_type": "ORG",
"sentiment_prob": 0.95
}
]');
It is an array of objects because a given text could have more than one entity. I am wondering how to write a query that would retrieve all records that have John Smith as an entity, and also those that have John Smith as an entity and class_label negative for him.
I am still learning jsonb and know how to query for a key being present, but not how to query for a key being present in any element in an array.
There are several ways to do that but this is probably the most generic and clear (yet not necessarily the most efficient) -
Flatten the table (a lateral join is insrtumental here):
select id, content, t.*
from tbl,
lateral jsonb_to_recordset(tsent) as t(
class integer,
entity text,
sent_no integer,
class_label text,
entity_type text,
sentiment_prob numeric
);
id | content | class | entity | sent_no | class_label | entity_type | sentiment_prob |
---|---|---|---|---|---|---|---|
1 | 0 | John Smith | 0 | negative | PER | 0.95 | |
1 | 1 | University of Illinois | 0 | neutral | ORG | 0.95 |
Query using the above "flat" table (t
CTE):
with t as
(
select id, content, t.*
from tbl,
lateral jsonb_to_recordset(tsent) as t(
class integer,
entity text,
sent_no integer,
class_label text,
entity_type text,
sentiment_prob numeric
)
)
select content, <other expression(s) here>
from t
where <your logic here>;