I have database that is equivalent to following table:
id | foo | bar
---+------+-----
1 | 5 | 6
2 | 7 | NULL
but, unfortunately, implemented as Entity-Attribute-Value:
CREATE TABLE obj(id INTEGER NOT NULL PRIMARY KEY);
CREATE TABLE attrdef(id INTEGER NOT NULL PRIMARY KEY, name VARCHAR(4));
CREATE TABLE attr(obj_id INTEGER NOT NULL, attrdef_id INTEGER NOT NULL, value INTEGER NOT NULL);
INSERT INTO obj VALUES(1);
INSERT INTO obj VALUES(2);
INSERT INTO attrdef VALUES(3, 'foo');
INSERT INTO attrdef VALUES(4, 'bar');
INSERT INTO attr VALUES(1,3,5);
INSERT INTO attr VALUES(1,4,6);
INSERT INTO attr VALUES(2,3,7);
I need to query that database to get data in "proper" form - like in example table. I have tried:
SELECT obj.id, foo.value, bar.value
FROM obj
LEFT JOIN attr foo ON (obj.id = foo.obj_id)
LEFT JOIN attrdef foo_def ON (foo.attrdef_id = foo_def.id)
LEFT JOIN attr bar ON (obj.id = bar.obj_id)
LEFT JOIN attrdef bar_def ON (bar.attrdef_id = bar_def.id)
WHERE foo_def.name = 'foo' AND bar_def.name = 'bar';
but the second row is missing:
id | foo | bar
---+------+-----
1 | 5 | 6
and
SELECT obj.id,
MAX(CASE WHEN name='foo' THEN value ELSE NULL END) foo,
MAX(CASE WHEN name='bar' THEN value ELSE NULL END) bar
FROM obj LEFT JOIN attr ON (obj.id = attr.obj_id)
LEFT JOIN attrdef ON (attr.attrdef_id = attrdef.id)
GROUP BY obj.id;
which gives correct result:
id | foo | bar
---+------+-----
1 | 5 | 6
2 | 7 | NULL
but performance of this query is unacceptable.
I'd like standard SQL query, but good MySQL-specific solution would be appreciated.
You just need to move the conditions to the on
clauses:
SELECT obj.id, foo.value, bar.value
FROM obj LEFT JOIN
attr foo
ON obj.id = foo.obj_id LEFT JOIN
attrdef foo_def
ON foo.attrdef_id = foo_def.id AND foo_def.name = 'foo' LEFT JOIN
attr bar
ON obj.id = bar.obj_id LEFT JOIN
attrdef bar_def
ON bar.attrdef_id = bar_def.id AND bar_def.name = 'bar';
For the aggregation method, I would go with:
SELECT obj.id,
MAX(CASE WHEN name = 'foo' THEN value END) foo,
MAX(CASE WHEN name = 'bar' THEN value END) bar
FROM obj LEFT JOIN
attr
ON obj.id = attr.obj_id LEFT JOIN
attrdef
ON attr.attrdef_id = attrdef.id
WHERE name IN ('foo', 'bar')
GROUP BY obj.id;
The left join
s might not be necessary in this case (depending on the distribution of missing values). In any case, if you start looking at more and more attributes, the JOIN
method takes longer and longer. The GROUP BY
method has about the same performance.
EDIT:
The correct query is:
SELECT obj.id, foo.value, bar.value
FROM obj LEFT JOIN
(attr foo JOIN
attrdef foo_def
ON foo.attrdef_id = foo_def.id AND foo_def.name = 'foo'
)
ON obj.id = foo.obj_id LEFT JOIN
(attr bar JOIN
attrdef bar_def
ON bar.attrdef_id = bar_def.id AND bar_def.name = 'bar'
)
ON obj.id = bar.obj_id ;
Here is the SQL Fiddle.