Create Postgres JSONB Index on Array Sub-Object


I have a table myTable with a JSONB column myJsonb. The data is structured like this, and I want to index it:

{
  "myArray": [
    {
      "subItem": {
        "email": "[email protected]"
      }
    },
    {
      "subItem": {
        "email": "[email protected]"
      }
    }
  ]
}

I want to run indexed queries on email like:

SELECT *
FROM mytable
WHERE '[email protected]' IN (
  SELECT lower(
      jsonb_array_elements(myjsonb -> 'myArray')
      -> 'subItem'
      ->> 'email'
  )
);

How do I create a Postgres JSONB index for that?


Solution

  • If you don't need the lower() in there, the query can be simple and efficient:

    SELECT *
    FROM   mytable
    WHERE  myjsonb -> 'myArray' @> '[{"subItem": {"email": "[email protected]"}}]'
    

    Supported by a jsonb_path_ops index:

    CREATE INDEX mytable_myjsonb_gin_idx ON mytable
    USING  gin ((myjsonb -> 'myArray') jsonb_path_ops);
    

    But the match is case-sensitive.
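
    To see the case sensitivity in isolation, here is a minimal sketch that compares two jsonb literals directly, with no table involved. The addresses are made-up placeholders, not values from the question:

    ```sql
    -- jsonb containment compares string values exactly, so a difference in case is a mismatch.
    SELECT '[{"subItem": {"email": "Alice@Example.com"}}]'::jsonb
           @> '[{"subItem": {"email": "alice@example.com"}}]'::jsonb AS mixed_case_match
         , '[{"subItem": {"email": "alice@example.com"}}]'::jsonb
           @> '[{"subItem": {"email": "alice@example.com"}}]'::jsonb AS exact_match;
    -- mixed_case_match: false, exact_match: true
    ```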

    Case-insensitive!

    If you need the search to match disregarding case, things get more complex.

    You could use this query, similar to your original:

    SELECT *
    FROM   mytable
    WHERE  EXISTS (
       SELECT 1
       FROM   jsonb_array_elements(myjsonb -> 'myArray') arr
       WHERE  lower(arr #>>'{subItem, email}') = '[email protected]'
       );
    

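    But I can't think of a good way to use an index for this: lower() is applied to elements unnested in a subquery, not to an expression on the table row that an index could cover.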

    Instead, I would use an expression index based on a function extracting an array of lower-case emails:

    Function:

    CREATE OR REPLACE FUNCTION f_jsonb_arr_lower(_j jsonb, VARIADIC _path text[])
      RETURNS jsonb LANGUAGE sql IMMUTABLE AS
    'SELECT jsonb_agg(lower(elem #>> _path)) FROM jsonb_array_elements(_j) elem';
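
    A quick way to check what the function produces is to call it on a literal. This is only a sketch with placeholder addresses, assuming the function above has been created:

    ```sql
    -- Extract the lower-cased value found at path subItem -> email in every array element.
    SELECT f_jsonb_arr_lower(
              '[{"subItem": {"email": "Alice@Example.com"}}
              , {"subItem": {"email": "BOB@example.com"}}]'::jsonb
            , 'subItem', 'email') AS emails;
    -- Expected to be a jsonb array of lower-case strings,
    -- e.g. ["alice@example.com", "bob@example.com"]
    ```

    The element order follows the unnested array in practice, but jsonb_agg() without ORDER BY does not guarantee it. For a containment test with @> the order does not matter.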
    

    Index:

    CREATE INDEX mytable_email_arr_idx ON mytable
    USING  gin (f_jsonb_arr_lower(myjsonb -> 'myArray', 'subItem', 'email') jsonb_path_ops);
    

    Query:

    SELECT *
    FROM   mytable 
    WHERE  f_jsonb_arr_lower(myjsonb -> 'myArray', 'subItem', 'email') @> '"[email protected]"';
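
    To check whether the planner actually picks the expression index, you can look at the query plan. A sketch, with a placeholder address; the expression in the WHERE clause has to match the indexed expression exactly:

    ```sql
    -- Show the plan without running the query.
    EXPLAIN
    SELECT *
    FROM   mytable
    WHERE  f_jsonb_arr_lower(myjsonb -> 'myArray', 'subItem', 'email') @> '"alice@example.com"';

    -- On a small test table the planner may still prefer a sequential scan.
    -- For testing only, in the current session, you can discourage that and run EXPLAIN again:
    SET enable_seqscan = off;
    ```

    If the index is used, the plan should mention mytable_email_arr_idx, typically as a bitmap index scan. Remember to RESET enable_seqscan afterwards.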
    

    This works with an untyped string literal or with an actual jsonb value, but it stops working if you pass a typed text or varchar value (as in a prepared statement). Postgres will not cast text to jsonb implicitly, because the input is ambiguous. You need an explicit cast in this case:

    ... @> '"[email protected]"'::text::jsonb;

    Or pass a simple string without enclosing double quotes and do the conversion to jsonb in Postgres:

    ... @> to_jsonb('[email protected]'::text);
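
    Put together for the prepared-statement case, this might look as follows. The statement name find_by_email and the address are hypothetical; lower() on the parameter is an addition so that the caller can pass any case:

    ```sql
    -- $1 is typed text, so it is converted to a jsonb string explicitly with to_jsonb().
    PREPARE find_by_email(text) AS
    SELECT *
    FROM   mytable
    WHERE  f_jsonb_arr_lower(myjsonb -> 'myArray', 'subItem', 'email') @> to_jsonb(lower($1));

    EXECUTE find_by_email('Alice@Example.com');

    DEALLOCATE find_by_email;
    ```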
