I am trying to learn some Postgres skills to help me build my Flask application, and I care a lot about performance, but I ran into some issues.
I have the following table (`Sessions`) in Postgres:
SessionID | Other columns | Statics |
---|---|---|
BDzQDeGr | * | {"excludedOptions": {"questID": ["optID", "optID2"]}} |
Note: `Statics` is a `jsonb` column.
I wanted to execute a query that appends `optID` to the `"Statics"->'excludedOptions'->'questID'` jsonb array if `questID` exists, and otherwise adds `questID` to `excludedOptions` with `["optID"]` as its value.
I spent the last 4 days trying to do that with one query, and I finally did it:
```sql
WITH ex AS (
    SELECT "Statics" -> 'excludedOptions'
    FROM "Sessions"
    WHERE "SessionID" = '{data["sessionId"]}'
)
UPDATE "Sessions"
SET "Statics" = jsonb_set(
    "Statics",
    ARRAY['excludedOptions', '{data["questionId"]}'],
    array_to_json(array_remove((
        SELECT ARRAY['{data["answerId"]}'] || coalesce((
            SELECT jsonb_array_to_text_array((
                SELECT value
                FROM jsonb_each((SELECT ex.* FROM ex))
                WHERE key = '{data["questionId"]}'
            ))
        ), null)
    ), null))::jsonb,
    TRUE
)
WHERE "SessionID" = '{data["sessionId"]}'
```
Note: the SQL above is a Python f-string, so the `{...}` placeholders are interpolated before the query runs.
But when I measured the execution time, it was about 300-600 ms, which is a lot for this simple operation.
I wanted it to be one query because somehow I believe that one query is better :).
Thank you for spending a little of your time on this.
Table Schema:
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description |
---|---|---|---|---|---|---|---|
ID | numeric(16,0) | | not null | | main | | |
SessionID | character varying(8) | | not null | | extended | | |
Title | text | | not null | | extended | | |
Type | text | | not null | | extended | | |
Source | text | | not null | | extended | | |
Questions | character(6)[] | | not null | | extended | | |
Skipped | character(6)[] | | | '{}'::bpchar[] | extended | | |
Answered | json | | | '{}'::json | extended | | |
UsedHints | character(6)[] | | | '{}'::bpchar[] | extended | | |
FinalResult | smallint | | | 0 | plain | | |
Statics | jsonb | | | '{"timer":{}, "excludedOptions":{}}'::json | extended | | |
Marked | character(6)[] | | | '{}'::bpchar[] | extended | | |
Time | integer | | not null | 0 | plain | | |
EXPLAIN ANALYZE:
```
Seq Scan on "Sessions"  (cost=0.00..1.00 rows=1 width=894) (actual time=0.016..0.017 rows=1 loops=1)
Planning Time: 0.037 ms
Execution Time: 0.030 ms
```
Your table has no indexes, so PostgreSQL must scan the whole table to find anything; that's what a "Seq Scan" is. You need to add some indexes to avoid table scans and make queries efficient.
Since you're checking `WHERE "SessionID" = ...`, an index on `SessionID` would allow Postgres to find all matching rows without having to scan the whole table.
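For example, something like this should do it (the index name is just a suggestion, and you could make it `UNIQUE` if session IDs are unique):

```sql
-- Index the column used in the WHERE clause so lookups become index scans.
CREATE INDEX sessions_session_id_idx ON "Sessions" ("SessionID");
```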
See Use The Index, Luke for more about SQL performance tuning and indexes.
Other notes:

- Don't use `char`; there is no benefit and many drawbacks. Use `varchar` or `text`.
- Don't use the `json` type; `jsonb` is more efficient. Especially don't mix `json` and `jsonb`, it just invites confusion.
- `numeric(16,0)` is an odd choice for a primary key. A simple `bigint` will use less space, be more efficient, and store a larger number. PostgreSQL even provides the convenient `bigserial` type for auto-incrementing primary keys.
- Avoid column names like `time` that collide with type names. Use the "at" and "on" conventions for timestamps and dates, for example `created_at` or `created_on`. (A schema sketch applying these notes follows the list.)
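Putting those notes together, the schema might look something like this. It's only a sketch: the snake_case names, the `NOT NULL` choices, the `created_at` column, and the trimmed column list are illustrative assumptions, not a drop-in replacement:

```sql
CREATE TABLE sessions (
    id         bigserial PRIMARY KEY,              -- bigint + auto-increment instead of numeric(16,0)
    session_id varchar(8) NOT NULL,                -- varchar instead of char
    title      text NOT NULL,
    answered   jsonb NOT NULL DEFAULT '{}',        -- jsonb instead of json
    statics    jsonb NOT NULL DEFAULT '{"timer": {}, "excludedOptions": {}}',
    created_at timestamptz NOT NULL DEFAULT now()  -- "at" naming convention
);

-- The index suggested above, against the sketched names:
CREATE INDEX sessions_session_id_idx ON sessions (session_id);
```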