mysql> show columns in m like 'fld';
+-------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| fld | varchar(45) | YES | MUL | NULL | |
+-------+-------------+------+-----+---------+-------+
1 row in set (0.02 sec)
mysql> show columns in i like 'fld';
+-------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| fld | varchar(45) | YES | MUL | NULL | |
+-------+-------------+------+-----+---------+-------+
1 row in set (0.02 sec)
mysql> select count(distinct fld) from i;
+---------------------+
| count(distinct fld) |
+---------------------+
| 27988 |
+---------------------+
1 row in set (0.03 sec)
mysql> select count(distinct fld) from m;
+---------------------+
| count(distinct fld) |
+---------------------+
| 72558 |
+---------------------+
1 row in set (0.07 sec)
The above results seem reasonable based on what I know of the tables in question.
mysql> select count(*) from m where fld not in (select fld from i);
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (0.11 sec)
That last result does not seem reasonable. There must be some rows in m
with fld
not in i
! Can someone please explain why I get 0
as the result?
For completeness (because I suspect it may be relevant), I'll also paste this result:
mysql> select count(*) from m where fld is null;
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (0.00 sec)
Edit: In reply to comments, I'm editing in the following info also, in case it helps someone answer my question:
select count(*) from m join i using (fld)
yields 9350; with left join
, 73087; with right join
, 28872.select count(*) from i where fld is null
yields 810.I've now figured it out. The documentation reads:
To comply with the SQL standard,
IN
returnsNULL
not only if the expression on the left hand side isNULL
, but also if no match is found in the list and one of the expressions in the list isNULL
.
Thus, fld not in
(i.e. not fld in
) returns not null
(i.e. null
) whenever there's a null
in the derived table (which there is here) and fld
isn't found in the derived table (which is what I'm testing for).
There may be a better workaround than this, but I'm using (select fld from i where fld is not null)
.