mysql mariadb mysql-connector-python

Can't use parameters in subquery when selecting from view


System: MariaDB 10.3.15, Python 3.7.2, mysql.connector Python package

I'm having trouble determining the exact cause of a problem, possibly a bug in MariaDB/MySQL, that occurs when executing the query below against the table structure described below. The confusing part is the error message

1356 (HY000): View 'test_project.denormalized' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them

which seems to relate to the problem at first, but the further I dig into why this is happening, the more I get the feeling this error message is a red herring.

Steps to reproduce:

CREATE DATABASE `test_project`;

USE `test_project`;

CREATE TABLE `normalized` (
  `id` INT NOT NULL AUTO_INCREMENT,
  `foreign_key` INT NOT NULL,
  `name` VARCHAR(45) NOT NULL,
  `value` VARCHAR(45) NULL,
  PRIMARY KEY (`id`));

INSERT INTO `normalized` (`foreign_key`, `name`, `value`) VALUES
(1, 'attr_1', '1'),
(1, 'attr_2', '2'),
(2, 'attr_1', '3'),
(2, 'attr_2', '4');

CREATE OR REPLACE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `denormalized` AS 
select
    max(`iq`.`foreign_key`) AS `foreign_key`,
    max(`iq`.`attr_1`) AS `attribute_1`,
    max(`iq`.`attr_2`) AS `attribute_2`
from (
    select
        `foreign_key` AS `foreign_key`,
        if(`name` = 'attr_1',`value`,NULL) AS `attr_1`,
        if(`name` = 'attr_2',`value`,NULL) AS `attr_2`
    from `normalized`
) as `iq`
group by `iq`.`foreign_key`;

Using Python, connect to the database and execute the following query:

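import mysql.connector

# Select test_project explicitly so the unqualified view name resolves
conn = mysql.connector.connect(host="somehost", user="someuser", password="somepassword", database="test_project")
cursor = conn.cursor()
query = """select * from denormalized as d
where d.`foreign_key` in
(
    SELECT distinct(foreign_key)
    FROM normalized
    where value = %s
);"""
# The parameter is passed separately and substituted for %s
cursor.execute(query, ["2"])
results = cursor.fetchall()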

Further information: At first I assumed it was a privilege issue, but even using root for everything and double-checking hosts and specific privileges didn't change anything.

Then I dug deeper into what the queries and views involved do (the test case above is a reduced version of what's actually in our database) and tested each part. Selecting from the view works. Running the view's query directly works. Selecting from the view with a static subquery works. In fact, replacing the view in the problematic query with its definition works too.

I've boiled it down to one combination: selecting from the view with a subquery in the where clause, where that subquery uses parameters. This causes the error to appear. Using a static subquery or replacing the view with its definition works just fine; it's only this specific circumstance where it fails.

And I have no idea why.
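
The isolation steps described above could be scripted roughly as follows. This is only a sketch: `try_queries`, `VARIANTS`, `TEMPLATE`, `VIEW_DEFINITION` and `main` are hypothetical names introduced here, the connection details are the placeholders from the question, and `database="test_project"` is an assumption. The helper makes no claim about which variant fails; it just reports what the server returns for each one.

```python
# Hypothetical harness: run each query variant and record which ones raise.

VIEW_DEFINITION = """(
    SELECT MAX(iq.foreign_key) AS foreign_key,
           MAX(iq.attr_1) AS attribute_1,
           MAX(iq.attr_2) AS attribute_2
    FROM (
        SELECT foreign_key,
               IF(name = 'attr_1', value, NULL) AS attr_1,
               IF(name = 'attr_2', value, NULL) AS attr_2
        FROM normalized
    ) AS iq
    GROUP BY iq.foreign_key
)"""

TEMPLATE = (
    "SELECT * FROM {source} AS d WHERE d.foreign_key IN "
    "(SELECT DISTINCT foreign_key FROM normalized WHERE value = {rhs})"
)

# (label, sql, params): the three combinations compared in the question.
VARIANTS = [
    ("view + static subquery",
     TEMPLATE.format(source="denormalized", rhs="'2'"), ()),
    ("view + parameterized subquery",
     TEMPLATE.format(source="denormalized", rhs="%s"), ("2",)),
    ("inlined view + parameterized subquery",
     TEMPLATE.format(source=VIEW_DEFINITION, rhs="%s"), ("2",)),
]


def try_queries(conn, variants):
    """Execute every variant on a DB-API connection; map label -> outcome."""
    outcome = {}
    for label, sql, params in variants:
        cursor = conn.cursor()
        try:
            cursor.execute(sql, params)
            cursor.fetchall()
            outcome[label] = "ok"
        except Exception as exc:  # broad on purpose: we only report failures
            outcome[label] = f"failed: {exc}"
        finally:
            cursor.close()
    return outcome


def main():
    # Imported here so try_queries stays usable without the driver installed.
    import mysql.connector

    conn = mysql.connector.connect(
        host="somehost", user="someuser", password="somepassword",
        database="test_project",  # assumption: the schema from the steps above
    )
    try:
        for label, result in try_queries(conn, VARIANTS).items():
            print(f"{label}: {result}")
    finally:
        conn.close()
```

Calling `main()` against a server that has the schema above prints one line per variant, which makes it easy to confirm that only the combination of view and parameter is affected.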


Solution

  • The group by does not make sense; did you really mean one of these?

    This returns one row:

    select  max(`foreign_key`) AS `foreign_key`,
            max(if(`name` = 'attr_1', `value`,NULL)) AS `attribute_1`,
            max(if(`name` = 'attr_2', `value`,NULL)) AS `attribute_2`
        from  `normalized`;
    

    This uses the GROUP BY and returns one row per foreign_key:

    select  `foreign_key`,
            max(if(`name` = 'attr_1', `value`,NULL)) AS `attribute_1`,
            max(if(`name` = 'attr_2', `value`,NULL)) AS `attribute_2`
        from  `normalized` 
        group by  `foreign_key`;
    

    Your python query is probably better in either of these formulations:

    select  d.*
        FROM ( SELECT  distinct(foreign_key)
                FROM  normalized
                where
                 value  = %s  ) AS f
        JOIN  denormalized as d  ON d.foreign_key = f.foreign_key;
    
    select  d.*
        FROM denormalized as d
        WHERE EXISTS ( SELECT 1
                FROM  normalized
                where foreign_key = d.foreign_key
                  AND value  = %s  )
    

    Both would benefit from a composite index on normalized: INDEX(value, foreign_key).
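
To check that the two suggested formulations select the same rows, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MariaDB, so that it runs without a server. The differences are assumptions of the sketch, not part of the answer: SQLite uses `?` placeholders instead of `%s` and `CASE` instead of `IF()`, the index name `idx_value_fk` is made up, and the join is given an explicit alias and `ON` condition. It says nothing about whether error 1356 occurs on MariaDB.

```python
import sqlite3

# In-memory stand-in for the schema from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE normalized (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    foreign_key INTEGER NOT NULL,
    name VARCHAR(45) NOT NULL,
    value VARCHAR(45)
);
INSERT INTO normalized (foreign_key, name, value) VALUES
    (1, 'attr_1', '1'), (1, 'attr_2', '2'),
    (2, 'attr_1', '3'), (2, 'attr_2', '4');

-- The composite index suggested in the answer (hypothetical name).
CREATE INDEX idx_value_fk ON normalized (value, foreign_key);

-- The flattened GROUP BY form of the view: one row per foreign_key.
CREATE VIEW denormalized AS
SELECT foreign_key,
       MAX(CASE WHEN name = 'attr_1' THEN value END) AS attribute_1,
       MAX(CASE WHEN name = 'attr_2' THEN value END) AS attribute_2
FROM normalized
GROUP BY foreign_key;
""")

# Formulation 1: join the view against the matching foreign keys.
join_query = """
SELECT d.*
FROM (SELECT DISTINCT foreign_key FROM normalized WHERE value = ?) AS f
JOIN denormalized AS d ON d.foreign_key = f.foreign_key
"""

# Formulation 2: correlated EXISTS against the base table.
exists_query = """
SELECT d.*
FROM denormalized AS d
WHERE EXISTS (SELECT 1 FROM normalized AS n
              WHERE n.foreign_key = d.foreign_key AND n.value = ?)
"""

join_rows = conn.execute(join_query, ("2",)).fetchall()
exists_rows = conn.execute(exists_query, ("2",)).fetchall()
print(join_rows)    # [(1, '1', '2')]
print(exists_rows)  # [(1, '1', '2')]
```

Both queries return the single denormalized row for foreign_key 1, the only key that has a row with value '2'.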