Search code examples
pythonmysqlselectmysql-connector-python

MySQL perform multiple, similar select queries with one statement


I am currently using MySQL Connector/Python to query a database.

Currently, one of the processes I am doing looks something like this:

SELECT some_column FROM some_table WHERE some_column_2 = foo_1;
SELECT some_column FROM some_table WHERE some_column_2 = foo_2;
SELECT some_column FROM some_table WHERE some_column_2 = foo_3;
...
SELECT some_column FROM some_table WHERE some_column_2 = foo_9999;
SELECT some_column FROM some_table WHERE some_column_2 = foo_10000;

I am running a very large number of very similar queries.

The individual results of each query are important. Even if a given query yields nothing as its result.

Unfortunately, the current approach of making several thousand queries like so is time consuming and extremely inefficient.

So, my question is: is there any way to condense this procedure into a single query, so that it runs more efficiently?

I feel like this should definitely be possible in MySQL, and I'm fairly sure there's something I'm missing here.


So far, I have tried using MySQL Connector/Python's executemany() function, like so:

cursor.executemany("SELECT some_column FROM some_table WHERE some_column_2 = %s", foo_list)

I expected this to yield a list of the results to each individual query, but optimized as many queries. Unfortunately, this does not appear to work in MySQL Connector/Python, and results in a syntax error.

I am guessing that executemany is not meant to be used with SELECT statements, only with INSERT statements.


I have also searched the internet for ways to perform bulk select queries, but I have yet to find anything relevant to my situation.

I would like to explicitly point out that

SELECT some_column FROM some_table WHERE some_column_2 IN (foo_1, foo_2, foo_3, etc);

is not a valid solution, because it does not preserve the results (or lack thereof) from each individual query.


Any assistance with this would be welcome.


Solution

  • Rather than

    SELECT some_column FROM some_table WHERE some_column_2 IN (foo_1, foo_2, ..., foo_N);

    simply ask for

    SELECT some_column, some_column_2 FROM some_table WHERE some_column_2 IN (foo_1, foo_2, ..., foo_N);
    

    The code generating the query knows about the foo 1..N values, and will see a subset of them coming back in the column 2 result.

    As a separate matter, using IN often is an anti-pattern, which interferes with the mysql optimizer's ability to create a good plan. For a large IN clause, you may find you're better off going to the trouble of populating a FOO table, and joining against that.