Search code examples
mysqlhavingnested-statement

how to get records of a table whose id's are taken from a select statement


I have two tables. I try to select some records from one of them. Then, ID's part of that selection should be used to select some records of the other table. For that, I wrote a statement which takes long time for execution. I even couldn't see the results. Also, that broke my phpmyadmin from localhost.

Here is the code:

SELECT * FROM uniquestructures as uns WHERE uns.ProteinID IN (SELECT unp.* FROM uniqueproteins as unp HAVING LENGTH(unp.PDBASequence) < 20) as T)

To make it clear, first it selects records with all columns which have sequence length less than 20. Later, according to IDs of the selected records, I am searching the records have the same ID (as ProteinID)

Thank you very much for your help


Solution

  • I reckon you need to use an INNER JOIN with a DISTINCT here:

    SELECT distinct uns.*
    FROM uniquestructures as uns 
    INNER JOIN uniqueproteins as unp on uns.ProteinID = unp.ProteinId
    where LENGTH(unp.PDBASequence) < 20;
    

    Also, you might have some joy if you create a separate column on the uniqueproteins table to hold the length of the PDBASequence column (say PDBASequenceLength). You could then put an index on the PDBASequenceLength column rather than calling LENGTH(PDBASequence) in your query. If the data is not static then create a trigger to populate the PDBASequenceLength column each time a row is inserted or updated into the uniqueproteins table. Hence:

    CREATE TRIGGER uniqueproteins_length_insert_trg
    AFTER INSERT ON uniqueproteins FOR EACH ROW SET NEW.PDBASequenceLength = length(new.PDBASequence);
    
    CREATE TRIGGER uniqueproteins_length_update_trg
    AFTER UPDATE ON uniqueproteins FOR EACH ROW SET NEW.PDBASequenceLength = length(new.PDBASequence);
    
    alter table uniqueproteins add key `uniqueproteinsIdx2` (PDBASequenceLength);
    

    Your query could then be:

    SELECT uns.*
    FROM uniquestructures as uns 
    INNER JOIN uniqueproteins as unp on uns.ProteinID = unp.ProteinId
    where unp.PDBASequenceLength < 20;
    

    Good luck!