System: MariaDB 10.3.15, Python 3.7.2, mysql.connector Python package
I'm having trouble determining the exact cause of a problem, possibly a bug in MariaDB/MySQL, that occurs when executing the query below against the table structure described below. The confusing part is the error message
1356 (HY000): View 'test_project.denormalized' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them
which seems to relate to the problem at first, but the further I dig into why this is happening, the more I get the feeling this error message is a red herring.
Steps to reproduce:
CREATE DATABASE `test_project`;
USE `test_project`;
CREATE TABLE `normalized` (
`id` INT NOT NULL AUTO_INCREMENT,
`foreign_key` INT NOT NULL,
`name` VARCHAR(45) NOT NULL,
`value` VARCHAR(45) NULL,
PRIMARY KEY (`id`));
INSERT INTO `normalized` (`foreign_key`, `name`, `value`) VALUES
(1, 'attr_1', '1'),
(1, 'attr_2', '2'),
(2, 'attr_1', '3'),
(2, 'attr_2', '4');
CREATE OR REPLACE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `denormalized` AS
select
max(`iq`.`foreign_key`) AS `foreign_key`,
max(`iq`.`attr_1`) AS `attribute_1`,
max(`iq`.`attr_2`) AS `attribute_2`
from (
select
`foreign_key` AS `foreign_key`,
if(`name` = 'attr_1',`value`,NULL) AS `attr_1`,
if(`name` = 'attr_2',`value`,NULL) AS `attr_2`
from `normalized`
) as `iq`
group by `iq`.`foreign_key`;
Using Python, connect to the database and execute the following query:
import mysql.connector

conn = mysql.connector.connect(host="somehost", user="someuser", password="somepassword", database="test_project")
cursor = conn.cursor()
query = """select * from denormalized as d
where d.`foreign_key` in
(
SELECT distinct(foreign_key)
FROM normalized
where value = %s
);"""
cursor.execute(query, ["2"])
results = cursor.fetchall()
Further information: At first I assumed it was a privilege issue, but even using root for everything and double-checking hosts and specific privileges didn't change anything.
Then I dug deeper into what the queries and views involved do (the test case above is a reduced version of what's actually in our database) and tested each part. Selecting from the view works. Running the query of the view works. Selecting from the view with a static subquery works. In fact, replacing the view in the problematic query with its definition works too.
I've boiled it down to this: selecting from the view with a subquery in the where clause, where that subquery uses parameters, causes the error. Using a static subquery or replacing the view with its definition works just fine; it's only this specific combination that fails.
And I have no idea why.
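The isolation steps described above can be sketched as a small harness that runs each variant and records whether it succeeds. This is only a sketch under assumptions: `try_query` is a hypothetical helper, it accepts any DB-API 2.0 cursor, and the demo lines use Python's built-in sqlite3 module as a stand-in so the snippet runs without a server. With mysql.connector the cursor would come from conn.cursor() and the placeholder would be %s rather than ?. SQLite is not expected to reproduce error 1356.

```python
import sqlite3


def try_query(cursor, sql, params=()):
    """Execute one statement; return ("ok", rows) or ("error", message)."""
    try:
        cursor.execute(sql, params)
        return ("ok", cursor.fetchall())
    except Exception as exc:  # with mysql.connector this would be mysql.connector.Error
        return ("error", str(exc))


# Stand-in database (assumption: SQLite instead of MariaDB, hence "?" placeholders).
demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute("CREATE TABLE normalized (foreign_key INT, name TEXT, value TEXT)")
demo_cursor.execute("INSERT INTO normalized VALUES (1, 'attr_2', '2')")

# Same subquery, once with a literal and once with a bound parameter.
static_result = try_query(
    demo_cursor, "SELECT foreign_key FROM normalized WHERE value = '2'"
)
param_result = try_query(
    demo_cursor, "SELECT foreign_key FROM normalized WHERE value = ?", ("2",)
)
# A statement that is expected to fail, to show how errors are reported.
broken_result = try_query(demo_cursor, "SELECT * FROM no_such_view")

for label, result in [("static", static_result), ("param", param_result), ("broken", broken_result)]:
    print(label, result[0])
```

Running the static and the parameterized variant through the same helper makes it easy to see which combination fails on the real server.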
The group by does not make sense; did you really mean one of these?
This returns one row:
select max(`foreign_key`) AS `foreign_key`,
max(if(`name` = 'attr_1', `value`,NULL)) AS `attribute_1`,
max(if(`name` = 'attr_2', `value`,NULL)) AS `attribute_2`
from `normalized`;
This uses the GROUP BY and returns one row per foreign_key:
select `foreign_key`,
max(if(`name` = 'attr_1', `value`,NULL)) AS `attribute_1`,
max(if(`name` = 'attr_2', `value`,NULL)) AS `attribute_2`
from `normalized`
group by `foreign_key`;
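To see the difference between the two variants on the sample data, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MariaDB (so it runs without a server) and uses CASE in place of IF(), which is an equivalent spelling here. The table layout and rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MariaDB server (assumption)
conn.executescript("""
CREATE TABLE normalized (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    foreign_key INT NOT NULL,
    name VARCHAR(45) NOT NULL,
    value VARCHAR(45) NULL
);
INSERT INTO normalized (foreign_key, name, value) VALUES
    (1, 'attr_1', '1'),
    (1, 'attr_2', '2'),
    (2, 'attr_1', '3'),
    (2, 'attr_2', '4');
""")

# Variant without GROUP BY: every column is aggregated over the whole table.
single_row = conn.execute("""
    SELECT MAX(foreign_key) AS foreign_key,
           MAX(CASE WHEN name = 'attr_1' THEN value END) AS attribute_1,
           MAX(CASE WHEN name = 'attr_2' THEN value END) AS attribute_2
    FROM normalized
""").fetchall()

# Variant with GROUP BY: one pivoted row per foreign_key.
per_key = conn.execute("""
    SELECT foreign_key,
           MAX(CASE WHEN name = 'attr_1' THEN value END) AS attribute_1,
           MAX(CASE WHEN name = 'attr_2' THEN value END) AS attribute_2
    FROM normalized
    GROUP BY foreign_key
    ORDER BY foreign_key
""").fetchall()

print(single_row)  # [(2, '3', '4')]
print(per_key)     # [(1, '1', '2'), (2, '3', '4')]
```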
Your Python query is probably better in either of these formulations:
select d.*
FROM ( SELECT distinct(foreign_key)
FROM normalized
where
value = %s ) AS x
JOIN denormalized as d ON d.foreign_key = x.foreign_key;
select d.*
FROM denormalized as d
WHERE EXISTS ( SELECT 1
FROM normalized
where foreign_key = d.foreign_key
AND value = %s )
They would benefit from INDEX(value, foreign_key).
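A rough sketch of both formulations with a bound parameter, again assuming sqlite3 as a stand-in for MariaDB (so the placeholder is ? instead of %s, and this does not reproduce or rule out the 1356 error). The index name `idx_value_fk` and the aliases `x` and `n` are made up for illustration; the derived table is given an alias and joined on foreign_key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MariaDB server (assumption)
conn.executescript("""
CREATE TABLE normalized (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    foreign_key INT NOT NULL,
    name VARCHAR(45) NOT NULL,
    value VARCHAR(45) NULL
);
INSERT INTO normalized (foreign_key, name, value) VALUES
    (1, 'attr_1', '1'),
    (1, 'attr_2', '2'),
    (2, 'attr_1', '3'),
    (2, 'attr_2', '4');

-- Hypothetical index name; the column order follows the suggestion above.
CREATE INDEX idx_value_fk ON normalized (value, foreign_key);

CREATE VIEW denormalized AS
SELECT foreign_key,
       MAX(CASE WHEN name = 'attr_1' THEN value END) AS attribute_1,
       MAX(CASE WHEN name = 'attr_2' THEN value END) AS attribute_2
FROM normalized
GROUP BY foreign_key;
""")

# Formulation 1: derived table of matching keys joined to the view.
join_sql = """
SELECT d.*
FROM (SELECT DISTINCT foreign_key FROM normalized WHERE value = ?) AS x
JOIN denormalized AS d ON d.foreign_key = x.foreign_key
"""

# Formulation 2: correlated EXISTS against the base table.
exists_sql = """
SELECT d.*
FROM denormalized AS d
WHERE EXISTS (SELECT 1
              FROM normalized AS n
              WHERE n.foreign_key = d.foreign_key
                AND n.value = ?)
"""

join_rows = conn.execute(join_sql, ("2",)).fetchall()
exists_rows = conn.execute(exists_sql, ("2",)).fetchall()
print(join_rows)    # [(1, '1', '2')]
print(exists_rows)  # [(1, '1', '2')]
```

Both return the pivoted row for foreign_key 1, the only key that has a row with value '2'.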