Search code examples
sqlpostgresqlquery-optimizationjsonbjsonb-array-elements

Postgres, searching with ILIKE operator from json array works different, than from regular table


In our DB we have tables, that have 6-8 M records, this data is static and in order of performance optimization we decided to store some data from this tables in separate table, and for predefined filters use the values from this table. For example we want to store distinct states, cities, tags in this table.

The example of data could be represented in such test table:

CREATE TABLE test_data_table (
                                 id    BIGSERIAL PRIMARY KEY,
                                 state  text,
                                 city text
);

INSERT INTO test_data_table (state, city)
    values  ('MD', 'Union Bridge'),
            ('IL', 'Bourbonnais'),
            ('NC', 'Rdelk'),
            ('FL', 'Bonita Springs'),
            ('FL', 'Unit Ave Maria'),
            ('AZ', 'N Placita Chula Vista Tucson'),
            ('TX', 'Sienna'),
            ('LA', 'Lornager'),
            ('CA', 'Empire'),
            ('MA', 'Middleton'),
            ('CO', 'Yampa'),
            ('GA', 'Spr Project Name Unit'),
            ('IN', 'Greencastle'),
            ('NV', 'Flora'),
            ('RI', 'Coventry'),
            ('WA', 'Rice'),
            ('FL', 'Orange City'),
            ('TX', 'Haltom City'),
            ('CA', 'Moraga'),
            ('LA', 'Belle Chasse'),
            ('AZ', 'E Rincon Creek Ranch Rd Tucson'),
            ('CA', 'Acampo'),
            ('AZ', 'S Avra Rd Tucson'),
            ('GA', 'Folkston'),
            ('FL', 'Biscayne Park'),
            ('FL', 'All Units Oxford'),
            ('IL', 'Volo'),
            ('MN', 'Peterson'),
            ('LA', 'Rayville'),
            ('KY', 'South Park View'),
            ('AZ', 'E Camino Emmanuel Sahuarita'),
            ('CA', 'Wrighwood'),
            ('CA', 'Los Angeles'),
            ('AZ', 'N Teal Blue Tr Tucson'),
            ('NY', 'Clifton Park'),
            ('IN', 'Frankl'),
            ('KY', 'Anchorage'),
            ('LA', 'Crown Point'),
            ('CA', 'Los BanosStruct CalcsSpan T'),
            ('IL', 'Chebanse');

And here the example of separate table for filters and the way to fill it with values:

    CREATE TABLE test_filter_values
(
    id    BIGSERIAL PRIMARY KEY,
    type  text,
    value jsonb
);

INSERT INTO test_filter_values (type, value)
VALUES ('CITY', (SELECT json_agg(DISTINCT city)::jsonb FROM test_data_table));

INSERT INTO test_filter_values (type, value)
VALUES ('STATE', (SELECT json_agg(DISTINCT state)::jsonb FROM test_data_table));

So, the main issue happens when we try to query data from this table, here is the example:

SELECT DISTINCT city from test_data_table
WHERE city ILIKE 'b%';

This query returns 4 results; And this one:

SELECT city FROM (SELECT jsonb_array_elements(value) AS city
                  FROM test_filter_values WHERE type = 'CITY') cities
WHERE city::text ILIKE 'b%';

Returns no results, LIKE operator without % at the begining or in the end returs zero results from test_filter_values table, but if using LIKE '%b%' with such case the behaviour is the same. The same issue happens with the exact match:

SELECT state FROM (SELECT jsonb_array_elements(value) AS state
                   FROM test_filter_values WHERE type = 'STATE') states
WHERE state::text = 'NC';

It returns no results either.

So how could we deal in this case?


Solution

  • That's because jsonb_array_elements returns jsonb not text. A JSON array might have other arrays, objects, arrays of objects etc inside it.

    That means the text values will be quoted as JSON and you will be trying to match against "Anchorage" not Anchorage when that gets converted to text.

    Maybe just store an array of text if that's what you want to use?