Boolean operations on mysql results

I have 3 Mysql tables:

[block_value]

id_block_value
file_id

[metadata]

id_metadata
metadata_name

[metadata_value]

meta_id
value
blockvalue_id

In these tables, there are pairs: metadata_name = value And list of pairs are put in blocks (id_block_value)

(A) If I want height = 1080:

SELECT DISTINCT file_id 
FROM metadata_value MV 
     INNER JOIN metadata M ON MV.meta_id = M.id_metadata 
     INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value 
WHERE (metadata_name = "height" and value = "1080");

+---------+
| file_id |
+---------+
|      21 | 
|      22 |
(...)
|    6962 |
(...)
|    8146 | 
|    8147 | 
+---------+
794 rows in set (0.06 sec)

(B) If I want file extension = mpeg:

SELECT DISTINCT file_id 
FROM metadata_value MV 
     INNER JOIN metadata M ON MV.meta_id = M.id_metadata 
     INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value 
WHERE (metadata_name = "file extension" and value = "mpeg");

+---------+
| file_id |
+---------+
|    6889 | 
|    6898 | 
|    6962 | 
+---------+
3 rows in set (0.06 sec)

BUT, if I want:

A and B
A or B
A and not B

Then, I don't know what is the best.

For A or B, I tried A union B which seems to do the trick.

SELECT DISTINCT file_id 
FROM metadata_value MV 
     INNER JOIN metadata M ON MV.meta_id = M.id_metadata 
     INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value 
WHERE (metadata_name = "height" and value = "1080")
UNION
SELECT DISTINCT file_id 
FROM metadata_value MV 
     INNER JOIN metadata M ON MV.meta_id = M.id_metadata 
     INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value 
WHERE (metadata_name = "file extension" and value = "mpeg");
+---------+
| file_id |
+---------+
|      21 | 
|      22 | 
|      34 |
(...)
|    6889 | 
|    6898 | 
+---------+
796 rows in set (0.13 sec)

For A and B, since there are no intersect in Mysql, I tried A and file_id in(B), but look at perfs (>4mn)...

SELECT DISTINCT file_id 
FROM metadata_value MV 
     INNER JOIN metadata M ON MV.meta_id = M.id_metadata 
     INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value 
WHERE (metadata_name = "height" and value = "1080")
and file_id in(
SELECT DISTINCT file_id 
FROM metadata_value MV 
     INNER JOIN metadata M ON MV.meta_id = M.id_metadata 
     INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value 
WHERE (metadata_name = "file extension" and value = "mpeg"));

+---------+
| file_id |
+---------+
|    6962 | 
+---------+
1 row in set (4 min 36.22 sec)

I tried B and file_id in(A) too, which is a lot better, but I will never know how which one to put first.

SELECT DISTINCT file_id 
FROM metadata_value MV 
     INNER JOIN metadata M ON MV.meta_id = M.id_metadata 
     INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value 
WHERE (metadata_name = "file extension" and value = "mpeg")
and file_id in(
SELECT DISTINCT file_id 
FROM metadata_value MV 
     INNER JOIN metadata M ON MV.meta_id = M.id_metadata 
     INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value 
WHERE (metadata_name = "height" and value = "1080"));

+---------+
| file_id |
+---------+
|    6962 | 
+---------+
1 row in set (0.75 sec)

So... what do I do now? Is there any better way for boolean operations? Any tip? Did I miss something?

EDIT: what data looks like:

This database contains a row in FILE table for each audio/video file inserted:

10, /path/to/file.ts
11, /path/to/file2.mpeg

There is a row in METADATA table for each potential information:

301, height
302, file extension

Then, a row in BLOCK table define a container:

101, Video
102, Audio
104, General

A file can have several blocks of metadata, a BLOCK_VALUE table contains instances of BLOCKS:

402, 101, 10 // Video 1
403, 101, 10 // Video 2
404, 101, 10 // Video 3
405, 102, 10 // Audio
406, 104, 10 // General

In this example, file 10 has 5 blocks: 3 Video (101) + 1 Audio (102) + 1 General (104)

Values are stored in METADATA_VALUE

302, 406, "ts" // file extension, General
301, 402, "1080" // height, Video 1
301, 403, "720" // height, Video 2
301, 404, "352" // height, Video 3

Solution

I'm opening a new post only to keep the "correct" solution tidy..

Ok, sorry, it seemed that I was making the wrong assumption. I never thought about two blocks being defined exactly the same way.

So, since I'm a copycat, and I like my getting the AND from OR solution (:P), I got to these two solutions..

ORing: I like Chris's solution better...

SELECT DISTINCT file_id 
  FROM metadata_value MV 
    INNER JOIN metadata M ON MV.meta_id = M.id_metadata 
    INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value 
   WHERE (metadata_name = "height" and value = "1080") 
      OR (metadata_name = "file extension" and value = "mpeg")

ANDing: I'll use your ORing version (the one with the UNION all

  SELECT FILE_ID FROM (
     SELECT DISTINCT 1, file_id 
             FROM metadata_value MV 
       INNER JOIN metadata M ON MV.meta_id = M.id_metadata 
       INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value 
              WHERE (metadata_name = "height" and value = "1080")
     UNION ALL
     SELECT DISTINCT 2, file_id 
             FROM metadata_value MV 
       INNER JOIN metadata M ON MV.meta_id = M.id_metadata 
       INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value 
              WHERE (metadata_name = "file extension" and value = "mpeg")
   ) IHATEAND
   GROUP BY FILE_ID
   HAVING COUNT(1)>1

Which gives:

+---------+
| FILE_ID |
+---------+
|    6962 |
+---------+
1 row in set (0.24 sec)

it should be a little less fast than the ORing seeing the performances you pasted and mines (I am 3 times as slow, time to upgrade -.-), but still significantly faster than the previous queries ;)

Anyway, how does the ANDing work? Put pretty simply, it just does the two separate queries and names the records according to the branch they come from, then counts the different file ids coming from them

UPDATE: another way of doing it without having to "name" the branches:

SELECT FILE_ID FROM (
    SELECT file_id 
        FROM metadata_value MV 
        INNER JOIN metadata M ON MV.meta_id = M.id_metadata 
        INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value 
            WHERE (metadata_name = "height" and value = "1080")
    GROUP BY FILE_ID
    UNION ALL
    SELECT file_id 
        FROM metadata_value MV 
        INNER JOIN metadata M ON MV.meta_id = M.id_metadata 
        INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value 
    WHERE (metadata_name = "file extension" and value = "mpeg")
    GROUP BY FILE_ID
    ) IHATEAND
GROUP BY FILE_ID
HAVING COUNT(1)>1

Here the results are the same (and performances as well) and I'm exploiting the fact that while UNION automatically sorts the duplicates and removes the duplicates, UNION ALL does not... which is perfect since I don't want them removed (and in general union all is also faster than union :) ), this way I can forget about naming.