Search code examples
jsonpostgresqlindexinggwt-ginjsonb

Pattern matching on jsonb key/value


I am using PostgreSQL 9.4. My table has a jsonb column:

CREATE TABLE "PreStage".transaction (
  transaction_id serial NOT NULL,
  transaction jsonb
  CONSTRAINT pk_transaction PRIMARY KEY (transaction_id)
);

CREATE INDEX idxgin ON "PreStage".transaction USING gin (transaction);

I store transactions in terms of key / value in the JSONB column. One of the requirements is to search customer name from the key value, hence I am running a query like:

SELECT transaction as data FROM "PreStage".transaction
WHERE  transaction->>('HCP_FST_NM') ilike ('%neer%');

What ever I do seems the query doesn't like the GIN index. How can I make the query use a GIN index with case insensitive pattern search?

I tried changing jsonb column to text, indexing it using gin_trgm_ops then search for required text, then converting the result to json and then searching in the required key/value. This approach doesn't seem to work.


Solution

  • The default GIN index operator class jsonb_ops does not allow full-text pattern matching on a value. Details:

    The best indexing strategy depends on your complete situation. There are many options. To just cover the one key you presented, you could use a functional trigram index. You already tested gin_trgm_ops, so you are already familiar with the additional module pg_trgm. For those who are not:

    Once the module is installed:

    CREATE INDEX idxgin ON "PreStage".transaction
    USING gin ((transaction->>'HCP_FST_NM') gin_trgm_ops);
    

    Then this query is supported:

    SELECT transaction AS data
    FROM   "PreStage".transaction
    WHERE  transaction->>'HCP_FST_NM' ILIKE '%neer%';
    

    I also removed some unnecessary parentheses.

    Depending on unknown details, there are various options to optimize index coverage.

    For instance, if many rows don't have a key 'HCP_FST_NM' at all, make that a partial index to exclude irrelevant rows and keep the index small:

    CREATE INDEX idxgin ON "PreStage".transaction
    USING gin ((transaction->>'HCP_FST_NM') gin_trgm_ops)
    WHERE transaction ? 'HCP_FST_NM';
    

    ? being the jsonb containment operator.
    And add the same predicate to every query that's supposed to use this index:

    SELECT transaction AS data
    FROM   "PreStage".transaction
    WHERE  transaction->>'HCP_FST_NM' ILIKE '%neer%'
    AND    transaction ? 'HCP_FST_NM';  -- even if that seems redundant.