Search code examples
mysqlmariadb-10.5

MariaDB crashes when using large WHERE IN


I have an old project that I am trying to upgrade to a newer version of SQL. It is currently running MySql 5.5 and is working just fine. I have migrated som test data to MariaDB 10.5.9, but when I try running a query (which works fine on MySql 5.5), MariaDB crashes.

The query is quite big, and uses WHERE IN extensively. Currently I am unable to refactor the query sadly, so I am instead trying to figure out what causes the crash.

It has 3 WHERE IN. First is 24 items, second is 696 and third is 2. If I remove just one item from either the first or second WHERE IN, it works are returns data instantly. The answers table is MyISAM

The error I am getting

SQL Error [08S01]: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

And the query

SELECT
    definition_id,
    answer AS value
FROM
    answers
WHERE
    definition_id IN (...)
    AND respondent_id in (...)
    AND context IN (1, 0)
LIMIT 50

I have already tried changing max_allowed_packet to something higher (it was 16MB in 5.5), but it sadly changes nothing.

Result of EXPLAIN SQL_NO_CACHE (if I remove alot of the data in the WHERE IN to avoid crash)

id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY ALL 824 Using temporary; Using filesort
2 DERIVED s1 range definition_respondent_context,respondent_id definition_respondent_context 12 824 Using index condition; Using temporary; Using filesort
2 DERIVED s2 eq_ref definition_respondent_context,respondent_id definition_respondent_context 12 const,database_name.s1.respondent_id,const 1

EDIT: I managed to get it working by using a join on the definitions table

SELECT
    a.definition_id,
    a.answer AS value
FROM
    answers AS a
JOIN definitions AS d ON a.definition_id = d.id
WHERE
    d.id IN (...)
    AND a.respondent_id in (...)
    AND a.context IN (1, 0)
LIMIT 50

Solution

  • One solution to your problem is to change your design/approach such that you don't have WHERE IN (...) clauses with 500-1000 items in them. For one thing, it's doubtful that you would ever have some application passing so many parameters back to your database instance. So, assuming this data is not coming from the outside, then it should be possible to maintain it in a separate table. Assuming you had two tables for this, your query could then become:

    SELECT a.definition_id, a.answer AS value
    FROM answers a
    INNER JOIN definitions d
        ON d.id = a.definition_id
    INNER JOIN respondents r
        ON r.id = a.respondent_id
    WHERE
        context IN (1, 0)
    -- ORDER BY <something>
    LIMIT 50;