Search code examples
sqlmysqlunion

Why does UNION return less rows than one of its subrequests with MySQL?


UNION can be used to merge the rows of two sub-requests:

SELECT column_name FROM table1
UNION
SELECT column_name FROM table2;

By default, MySQL's UNION is actually UNION DISTINCT, which means that the rows table1 and table2 have in common are de-duplicated. For example:

SELECT * FROM table_1;

--------
| name |
--------
| John |
| Mary |
--------

SELECT * FROM table_2;

---------
| name  |
---------
| John  |
| Steve |
---------

SELECT * FROM table1
UNION
SELECT * FROM table2;

---------
| name  |
---------
| John  |
| Mary  |
| Steve |
---------

Note that John is present only once in the final result, because MySQL de-duplicate this row.

We can say that UnionRows.length <= Table1Rows.length + Table2Rows.length.

However, I'm facing an unexpected behavior. I have UnionRows.length < Table1Rows.length. Which means that when I merge two sets, I get less rows than just taking one of the set. I would expect to have UnionRows.length >= Table1Rows.length and UnionRows.length >= Table2Rows.length.


Solution

  • This is because UNION also de-duplicates rows in each sub-request. For example:

    SELECT * FROM table_1;
    
    --------
    | name |
    --------
    | John |
    | John |
    | John |
    --------
    
    SELECT * FROM table_2;
    
    ---------
    | name  |
    ---------
    | Steve |
    ---------
    
    SELECT * FROM table1
    UNION
    SELECT * FROM table2;
    
    ---------
    | name  |
    ---------
    | John  |
    | Steve |
    ---------
    

    Note that the final result has less rows than table_1.