Search code examples
mysqlsubqueryuniondatabase-performance

How to avoid running an expensive sub-query twice in a union


I want to union two queries. Both queries use an inner join into a data set, that is very intensive to compute, but the dataset query is the same for both queries. For example:

SELECT veggie_id
FROM potatoes
INNER JOIN ( [...] ) massive_market
    ON massive_market.potato_id=potatoes.potato_id
UNION
SELECT veggie_id
FROM carrots
INNER JOIN ( [...] ) massive_market
    ON massive_market.carrot_id=carrots.carrot_id

Where [...] corresponds to a subquery that takes a second to compute, and returns rows of at least carrot_id and potato_id.

I want to avoid having the query for massive_market [...] twice in my overal query.

Whats the best way to do this?


Solution

  • The goal is to pull all repeated query-strings out of the list of query-strings requiring the repeated query-strings. So I kept potatoes and carrots within one unionizing subquery, and placed massive_market afterwards and outside this unification.

    This seems obrvious, but my question originated from a much more complex query, and the work needed to pull this strategy off was a bit more involving in my case. For my simple example in my question above, this would resolve in something like:

    SELECT veggie_id 
    FROM (
      SELECT veggie_id, potato_id, NULL AS carrot_id FROM potatoes
      UNION
      SELECT veggie_id, NULL AS potato_id, carrot_id FROM carrots
    ) unionized
    INNER JOIN ( [...] ) massive_market
      ON massive_market.potato_id=unionized.potato_id 
        OR massive_market.carrot_id=unionized.carrot_id