Search code examples
sqlpostgresqljsonb

How to query jsonb column that is an array of objects?


I have a Postgres db containing a table with a jsonb column tsent that contains the results of targeted sentiment analysis of text in another column. The data is structured like so:

CREATE TABLE tbl(id int PRIMARY KEY, tsent jsonb);

INSERT INTO tbl VALUES
  (1, '[
    {
        "class": 0,
        "entity": "John Smith",
        "sent_no": 0,
        "class_label": "negative",
        "entity_type": "PER",
        "sentiment_prob": 0.95
    },
    {
        "class": 1,
        "entity": "University of Illinois",
        "sent_no": 0,
        "class_label": "neutral",
        "entity_type": "ORG",
        "sentiment_prob": 0.95
    }
   ]');

It is an array of objects because a given text could have more than one entity. I am wondering how to write a query that would retrieve all records that have John Smith as an entity, and also those that have John Smith as an entity and class_label negative for him.

I am still learning jsonb and know how to query for a key being present, but not how to query for a key being present in any element in an array.


Solution

  • There are several ways to do that but this is probably the most generic and clear (yet not necessarily the most efficient) -

    • Flatten the table with a JSON object into a "regular" table
    • do whatever querying you need using this table

    Flatten the table (a lateral join is insrtumental here):

     select id, content, t.*
     from tbl,
      lateral jsonb_to_recordset(tsent) as t(
          class integer,
          entity text,
          sent_no integer,
          class_label text,
          entity_type text, 
          sentiment_prob numeric
      );
    
    id content class entity sent_no class_label entity_type sentiment_prob
    1 0 John Smith 0 negative PER 0.95
    1 1 University of Illinois 0 neutral ORG 0.95

    Query using the above "flat" table (t CTE):

    with t as 
    (
     select id, content, t.*
     from tbl,
        lateral jsonb_to_recordset(tsent) as t(
          class integer,
          entity text,
          sent_no integer,
          class_label text,
          entity_type text, 
          sentiment_prob numeric
      )
    )
    select content, <other expression(s) here>
    from t
    where <your logic here>;