Search code examples
postgresqljsonb

"Big" data in a JSONB column


I have a table with a metadata column (JSONB). Sometimes I run queries on this column. Example:

select * from "a" where metadata->'b'->'c' is not null

This column has always just small JSON objects <1KB. But for some records (less than 0.5%), it can be >500KB, because some sub-sub-properties have many data.

Today, I only have ~1000 records, everything works fine. But I think I will have more records soon, I don't know if having some big data (I don't speak about "Big Data" of course!) will have a global impact on performance. Is 500KB "big" for postgres and is it "hard" to parse? Maybe my question is too vague, I can edit if required. In other words:

Is having some (<0.5%) bigger entries in a JSONB column affect noticeably global performance of JSON queries?

Side note: assuming the "big" data is in metadata->'c'->'d' I don't run any queries to this particular property. Queries are always done on "small/common" properties. But the "big" properties still exists.


Solution

  • It is a theoretical question, so I hope a generic answer will satisfy.
    If you need numbers, I recommend that you run some performance tests. It shouldn't be too hard to generate some large jsonb objects for testing.

    As long as the data are jsonb and not json, operations like metadata->'b'->'c' will be fast. Where you could lose time is when a large jsonb value is loaded and uncompressed from the TOAST table (“detoasted”).

    You can avoid that problem by creating an index on the expression. Then an index scan does not have to detoast the jsonb and hence will be fast, no matter how big the jsonb is. So I think that you will not encounter problems with performance as long as you can index for the queries.

    You should pay attention to the total size of your database. If the expected size is very large, backups will become a challenge. It helsp to design and plan how to discard old and unused data. That is an aspect commonly ignored during application design that tends to cause headaches years later.