Search code examples
mysqlsql-order-bymysql-cluster

MySQL query results returned are semi-random / inconsistently ordered


I'm working with an ndb cluster setup that uses proxysql. There are 4 mysql servers, 4 data nodes, and 2 management nodes. The following happens when I access one of the mysql servers directly, so I think that I can safely rule out proxysql as the root cause, but beyond that I'm just lost.

Here's a table I set up to help illustrate my problem:

mysql> describe delain;
+----------+-------------+------+-----+---------+----------------+
| Field    | Type        | Null | Key | Default | Extra          |
+----------+-------------+------+-----+---------+----------------+
| album_id | tinyint(2)  | NO   | PRI | NULL    | auto_increment |
| album    | varchar(30) | YES  |     | NULL    |                |
+----------+-------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)

It contains the following data; note that I specified an order by clause:

mysql> select * from delain order by album_id;
+----------+-------------------------+
| album_id | album                   |
+----------+-------------------------+
|        1 | Lucidity                |
|        2 | April Rain              |
|        3 | We Are the Others       |
|        4 | The Human Contradiction |
|        5 | Moonbathers             |
+----------+-------------------------+
5 rows in set (0.00 sec)

If I don't specify an order clause, the results returned are seemingly random, such as this:

mysql> select * from delain;
+----------+-------------------------+
| album_id | album                   |
+----------+-------------------------+
|        3 | We Are the Others       |
|        5 | Moonbathers             |
|        1 | Lucidity                |
|        2 | April Rain              |
|        4 | The Human Contradiction |
+----------+-------------------------+
5 rows in set (0.00 sec)

When I repeat the query (sans order clause) I get a different ordering pretty much every time. It doesn't seem to be truly random, but there sure as heck isn't any sort of discernible pattern to me.

Why is this happening? My experience with mysql has always been that the default ordering is essentially according to the primary key, but this is also the first time I've used an ndb cluster in particular; I don't know if there's a difference there, or if there's a setting inside a config file that got missed or what. Any help is greatly appreciated!


Solution

  • This is standard SQL behavior.

    https://mariadb.com/kb/en/library/sql-99/order-by-clause/ says in part:

    An ORDER BY clause may optionally appear after a query expression: it specifies the order rows should have when returned from that query (if you omit the clause, your DBMS will return the rows in some random order).

    (emphasis mine)

    It'd be more accurate to say it will return the rows in some arbitrary order, instead of random order. Random implies that the order will change from one execution to the next.

    • In the case of InnoDB, the order tends to be the index order in which the rows were accessed. The index it reads is not necessarily the primary key. So the order is unchanging and somewhat predictable if you know something about the internals. But it's not random.

    • In the case of MyISAM, the order tends to be the order the rows are stored in the table, which can vary depending on the order the rows were inserted, and also depending on where there was space in the file at the time of insertion, after row deletions.

    • In the case of NDB, I don't know as much about its internals, so I can't describe its rule for "default" order, but it's still true that without an explicit ORDER BY, the storage engine is allowed to return rows in whatever order it wants to.