Postgresql 14.
Hi! I have multiple tables that are named like model_<uuid>
and each table has columns that are named like element_<uuid>
. All of these tables are in models
schema. And I have query that transform information_schema.columns
view in something like m2m table.
SELECT
SPLIT_PART(table_name, '_', 2)::UUID AS model_id,
SPLIT_PART(column_name, '_', 2)::UUID AS element_id
FROM information_schema.columns
WHERE
table_schema = 'models' AND
table_name LIKE 'model\_%' AND
column_name LIKE 'element\_%'
It works properly. But when I try to wrap it in a view or use it as a subquery, I get the following error.
Subquery example:
SELECT model_id, ARRAY_AGG(element_id) AS element_ids
FROM (
SELECT
SPLIT_PART(table_name, '_', 2)::UUID AS model_id,
SPLIT_PART(column_name, '_', 2)::UUID AS element_id
FROM information_schema.columns
WHERE
table_schema = 'models' AND
table_name LIKE 'model\_%' AND
column_name LIKE 'element\_%'
) AS t
WHERE model_id = '2ba221e1-f2e3-4438-bafb-8e7bfefe161f'
GROUP BY model_id
Error:
ERROR: invalid input syntax for type uuid: "elements"
I have table elements
but it is in public
schema and I have table_schema = 'models'
filter. I think that this filter does not work and subquery tries to cast it to 'uuid'. But when I run the subquery as a normal (root?) query, all rows are taken from the models
schema.
And when I remove WHERE model_id = '2ba221e1-f2e3-4438-bafb-8e7bfefe161f'
it's working properly again.
What is wrong? It literally blows my mind.
UPD:
Okay. I understand the reason. The reason for the error is that the query optimizer moves the query filter model_id = '...'
inside the subquery filters. Error was raised on public.model_elements
view.
I remove all UUIDs in table_name
from query and run this query:
EXPLAIN SELECT model_id, ARRAY_AGG(element_id)
FROM (
SELECT
table_name AS model_id,
SPLIT_PART(column_name, '_', 2)::UUID AS element_id
FROM information_schema.columns
WHERE
table_schema = 'models'
AND table_name LIKE 'model\_%'
AND column_name LIKE 'element\_%'
) AS t
WHERE model_id = 'abc'
GROUP BY model_id;
Fragment from query plan:
-> Index Scan using pg_class_relname_nsp_index on pg_class c (cost=0.28..4.31 rows=1 width=76)
Index Cond: ((relname >= 'model_'::text) AND (relname < 'model`'::text) AND (relname = 'abc'::name))
Filter: ((relname ~~ 'model\_%'::text) AND (relkind = ANY ('{r,v,f,p}'::""char""[])))"
-> Index Scan using pg_attribute_relid_attnam_index on pg_attribute a (cost=0.29..4.32 rows=1 width=78)
Index Cond: ((attrelid = c.oid) AND (attname >= 'element_'::text) AND (attname < 'element`'::text))
Filter: ((NOT attisdropped) AND (attnum > 0) AND (attname ~~ 'element\_%'::text) AND (pg_has_role(c.relowner, 'USAGE'::text) OR has_column_privilege(c.oid, attnum, 'SELECT, INSERT, UPDATE, REFERENCES'::text)))"
As you can see, it has combined all filters (root filters and subquery filters) by table_name,
including the SPLIT_PART(table_name, '_', 2)::UUID
transformation (if we consider the original query) and uses them first. That is, first it tries to convert all table names into UUIDs, ignoring those filters that filter out tables for which it can't be done.
I found the problem but how to fix it?
PostgreSQL has to calculate SPLIT_PART(table_name, '_', 2)::UUID
before it can compare it to '2ba221e1-f2e3-4438-bafb-8e7bfefe161f'
? Simply avoid the cast to uuid
and compare strings.
Also, PostgreSQL is free to rearrange the query as long as it remains semantically equivalent. In the case at hand, it pulls up the subquery and evaluates the index conditions first. If you want to prevent PostgreSQL from pulling up the subquery, add OFFSET 0
at the end of the subquery. That makes no semantic difference, but is an optimizer barrier.