Search code examples
mysqlsqlsql-order-byinnodb

Why is my case-insensitive order by failing to sort results properly?


I'm using MySql 5.6.15 on Amazon Linux. I'm trying to write a simple query to get results ordered by the lower-cased version of my table's NAME column, which is of type VARCHAR(100). The table has attributes

ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin

But notice I'm getting different results and the only thing that is different is that in the incorrect results, below, I'm getting one extra column ...

mysql> select NAME, ACTIVE, ADDRESS_ID, COUNTRY_ID, CREATED_ON, ORGANIZATION_ID, IMPORT_ADMIN_DATA_FROM_SIS, IMPORT_DATA_FROM_SIS, MDR_NUMBER from organization order by lower(NAME);
+--------------------------------------+--------+------------+------------+---------------------+-----------------+----------------------------+----------------------+------------+
| NAME                                 | ACTIVE | ADDRESS_ID | COUNTRY_ID | CREATED_ON          | ORGANIZATION_ID | IMPORT_ADMIN_DATA_FROM_SIS | IMPORT_DATA_FROM_SIS | MDR_NUMBER |
+--------------------------------------+--------+------------+------------+---------------------+-----------------+----------------------------+----------------------+------------+
| Billy Madison Elementary             |       | NULL       | US         | 2018-11-29 22:35:57 | 7788            |                            |                    0 | NULL       |
| Bradley County Schools               |       | NULL       | US         | 2018-11-29 22:35:57 | 8888            |                            |                    0 | NULL       |
| Billy Madison Elementary             |       | NULL       | US         | 2018-11-29 22:35:57 | 9998            |                            |                    0 | NULL       |

When I don't request that extra column, I get the correct results ...

mysql> select NAME, ACTIVE, ADDRESS_ID, COUNTRY_ID, CREATED_ON, ORGANIZATION_ID, IMPORT_ADMIN_DATA_FROM_SIS, IMPORT_DATA_FROM_SIS from organization order by lower(NAME);
+--------------------------------------+--------+------------+------------+---------------------+-----------------+----------------------------+----------------------+
| NAME                                 | ACTIVE | ADDRESS_ID | COUNTRY_ID | CREATED_ON          | ORGANIZATION_ID | IMPORT_ADMIN_DATA_FROM_SIS | IMPORT_DATA_FROM_SIS |
+--------------------------------------+--------+------------+------------+---------------------+-----------------+----------------------------+----------------------+
| Billy Madison Elementary             |       | NULL       | US         | 2018-11-29 22:35:57 | 9998            |                            |                    0 |
| Billy Madison Elementary             |       | NULL       | US         | 2018-11-29 22:35:57 | 7788            |                            |                    0 |
| Bradley County Schools               |       | NULL       | US         | 2018-11-29 22:35:57 | 8888            |                            |                    0 |

What the hell is going on here? How do I return results ordered by the lower-cased name?

Edit: Create table statement from running "SHOW CREATE TABLE" ...

| organization | CREATE TABLE `organization` (
  `ID` varchar(32) COLLATE utf8_bin NOT NULL,
  `STATE_ID` varchar(10) COLLATE utf8_bin DEFAULT NULL,
  `ORGANIZATION_ID` varchar(32) COLLATE utf8_bin NOT NULL,
  `COUNTRY_ID` varchar(10) COLLATE utf8_bin NOT NULL,
  `NAME` varchar(100) COLLATE utf8_bin NOT NULL,
  `ORGANIZATION_TYPE_ID` varchar(2) COLLATE utf8_bin NOT NULL,
  `PARENT_ORGANIZATION_ID` varchar(32) COLLATE utf8_bin DEFAULT NULL,
  `USER_ENTERED` tinyint(4) DEFAULT '0',
  `SAMPLE_ORGANIZATION` tinyint(4) DEFAULT '0',
  `IMPORT_DATA_FROM_SIS` tinyint(1) NOT NULL DEFAULT '0',
  `USE_EXTERNAL_AUTHENTICATION` tinyint(1) NOT NULL DEFAULT '0',
  `ADDRESS_ID` varchar(32) COLLATE utf8_bin DEFAULT NULL,
  `LTI_REFERER_DOMAIN` varchar(32) COLLATE utf8_bin DEFAULT NULL,
  `URL_ID` varchar(32) COLLATE utf8_bin DEFAULT NULL,
  `CREATED_ON` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
  `MDR_NUMBER` varchar(32) COLLATE utf8_bin DEFAULT NULL,
  `ACTIVE` bit(1) DEFAULT b'1',
  `IMPORT_ADMIN_DATA_FROM_SIS` bit(1) DEFAULT b'0',
  `USE_EXTERNAL_AUTH_FOR_ADMINS` bit(1) DEFAULT b'0',
  PRIMARY KEY (`ID`),
  UNIQUE KEY `UK_ORGANIZATION` (`ORGANIZATION_ID`),
  KEY `FK1_ORGANIZATION` (`COUNTRY_ID`),
  KEY `FK3_ORGANIZATION` (`ORGANIZATION_TYPE_ID`),
  KEY `FK2_ORGANIZATION` (`PARENT_ORGANIZATION_ID`),
  KEY `FK_ORGANIZATION` (`ADDRESS_ID`),
  KEY `FK5_ORGANIZATION` (`URL_ID`),
  CONSTRAINT `FK1_ORGANIZATION` FOREIGN KEY (`COUNTRY_ID`) REFERENCES `cb_country` (`ID`) ON DELETE CASCADE ON UPDATE NO ACTION,
  CONSTRAINT `FK2_ORGANIZATION` FOREIGN KEY (`PARENT_ORGANIZATION_ID`) REFERENCES `organization` (`ID`) ON DELETE CASCADE ON UPDATE NO ACTION,
  CONSTRAINT `FK3_ORGANIZATION` FOREIGN KEY (`ORGANIZATION_TYPE_ID`) REFERENCES `cb_org_type` (`ID`) ON DELETE CASCADE ON UPDATE NO ACTION,
  CONSTRAINT `FK5_ORGANIZATION` FOREIGN KEY (`URL_ID`) REFERENCES `sb_url` (`ID`) ON UPDATE NO ACTION,
  CONSTRAINT `FK_ORGANIZATION` FOREIGN KEY (`ADDRESS_ID`) REFERENCES `cb_address` (`ID`) ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin |

Solution

  • The SQL you posted is correct. I created a SQL fiddle to try to reproduce your problem using the data in your question, but it works fine. Comment out the order by clause and you get the bad result you show in your question.

    Sometimes UTF8 strings look identical but are not. Have you verified that

    select * from organization where name = "Billy Madison Elementary"
    

    returns the 2 rows you expect?

    If the NAMEs are identical, there is no kind of sorting that would not put them next to each other. The only sort on NAME that is consistent with the results you see is one where all 3 rows have identical NAME values. You might run into this sort of thing in some context where strings are all converted to integer values and all the names convert to zero, but even then I would expect the ordering of the results to stay consistent between the 2 queries in your question.

    There are number of obscure ways a query can be truncated, and I would guess that the most likely explanation of what you are seeing is that the order by clause in the first query is not making it to the server. How are you issuing the query? Have you tried using a (different) interactive way to run the query? What if you keep the MDR_NUMBER column but omit the IMPORT_ADMIN_DATA_FROM_SIS column in your query?

    Can you reproduce the problem using sqlfiddle.com? If so, please add the link to your question.