Search code examples
sqloptimizationdatabase-performance

Combine multiple SQL queries


Currently, to get the data I need, I need to execute multiple SQL statements:

SELECT pubkey.pubkey_id 
FROM pubkey 
WHERE pubkey.pubkey_hash = (input data [1])

SELECT txin_corr.tx_id 
FROM txin_corr 
WHERE txin_corr.pubkey_id = (pubkey.pubkey_id from previous query [max. 1])

SELECT txin_corr.pubkey_id 
FROM txin_corr 
WHERE txin_corr.tx_id = (txin_corr.tx_id from prev.qry. [n])

SELECT pubkey.pubkey_hash
FROM pubkey 
WHERE pubkey.pubkey_id = (txin_corr.pubkey_id from prev.qry. [n])

The first query is no problem because I only have to do it once. But I'm wondering if there is a way to combine (at least) the last three queries. As the db is pretty big (~ 20 GB), I think a "good query" may speed things up considerably.

What I'm doing is: For a given pubkey_id/pubkey_hash, get all tx_ids from txin_corr that contain this pubkey_id in the same row. Then, get all pubkey_ids from txin_corr where the row contains the retrieved tx_ids. Finally, get all the pubkey_hashs for the now retrieved pubkey_ids.


Solution

  • The earlier answer is correct: the key is to join the tables together multiple times. But, there are one-to many relationships in there, so there will need to be left outer joins, not just the inner joins.

    SELECT pk2.pubkey_hash
    FROM   pubkey pk
    INNER JOIN txin_corr tc ON pk.pubkey_id = tc.pubkey_id
    LEFT OUTER JOIN txin_corr tc2 ON tc.tx_id = tc2.tx_id
    LEFT OUTER JOIN pubkey pk2 ON tc2.pubkey_id = pk2.pubkey_id
    WHERE pk.pubkey_hash = (input data)