Consider a table defined as follows:
CREATE TABLE test (
id int4 NOT NULL,
tag_counts _jsonb NOT NULL DEFAULT ARRAY[]::jsonb[]
);
INSERT INTO test(id, tag_counts) values(1,array['{"type":1, "count":4}','{"type":2, "count":10}' ]::jsonb[])
How can I create an index on json key type
and how can I query on it?
Edit: Previously, there were no indexes on json keys and select queries used an unnest
operation as shown below:
select * from (SELECT unnest(tag_counts) as tc
FROM public.test) as t
where tc->'type' = '2';
The problem is, if the table has a large number of rows, the above query will not only include a full table scan, but also filtering through each jsonb array.
There is a way to index this, not sure how fast it will be.
If that was a "regular" jsonb
column, you could use a condition like where tag_counts @> '[{"type": 2}]'
which can use a GIN index on the column.
You can use that operator if you convert the array to "plain" json value:
select *
from test
where to_jsonb(tag_counts) @> '[{"type": 2}]'
Unfortunately, to_jsonb()
is not marked as immutable (I guess because of potential timestamp conversion in there) which is a requirement if you want to use an expression in an index.
But for your data, this is indeed immutable, so we can create a little wrapper function:
create function as_jsonb(p_input jsonb[])
returns jsonb
as
$$
select to_jsonb(p_input);
$$
language sql
immutable;
And with that function we can create an index:
create index on test using gin ( as_jsonb(tag_counts) jsonb_path_ops);
You will need to use that function in your query:
select *
from test
where as_jsonb(tag_counts) @> '[{"type": 2}]'
On a table with a million rows, I get the following execution plan:
Bitmap Heap Scan on stuff.test (cost=1102.62..67028.01 rows=118531 width=252) (actual time=15.145..684.062 rows=147293 loops=1)
Output: id, tag_counts
Recheck Cond: (as_jsonb(test.tag_counts) @> '[{"type": 2}]'::jsonb)
Heap Blocks: exact=25455
Buffers: shared hit=25486
-> Bitmap Index Scan on ix_test (cost=0.00..1072.99 rows=118531 width=0) (actual time=12.347..12.356 rows=147293 loops=1)
Index Cond: (as_jsonb(test.tag_counts) @> '[{"type": 2}]'::jsonb)
Buffers: shared hit=31
Planning:
Buffers: shared hit=23
Planning Time: 0.444 ms
Execution Time: 690.160 ms