Search code examples
sqlpostgresqlhibernateinner-join

Database Select cost too high


I want to use a double INNER SELECT on my database. If I use it via HIBERNATE and entities in my java application, the debugger gets an GC error.

  SELECT * FROM campaign WHERE id_campaign  IN 
    (SELECT id_campaign FROM event WHERE id_event IN 
     (SELECT id_event FROM dataset_event WHERE id_dataset=xxx)) limit 100;

Is there a better way with no GC error? Maybe a different select? Does Hibernate fills the event-entities of the second inner select? It is not necessary i guess..

FYI:

enter image description here

Error in console:

26-Feb-2020 09:36:36.575 SCHWERWIEGEND [ContainerBackgroundProcessor[StandardEngine[Catalina]]] org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run Unexpected death of background thread [ContainerBackgroundProcessor[StandardEngine[Catalina]]]
    java.lang.OutOfMemoryError: GC overhead limit exceeded
Exception in thread "ContainerBackgroundProcessor[StandardEngine[Catalina]]" java.lang.OutOfMemoryError: GC overhead limit exceeded

Hibernate version is 5.3.3.final


Solution

  • This probably does what you want:

    SELECT c.*
    FROM campaign c JOIN
         event e
         ON c.id_campaign = e.id_campaign JOIN
         dataset_event de
         ON de.id_event = e.id_event
    WHERE de.id_dataset = xxx
    LIMIT 100;
    

    For performance, you want indexes on:

    • dataset_event(id_dataset, id_event)
    • event(id_event, id_campaign)
    • campaign(id_campaign)

    You may already have some or all of these.

    The above might return duplicates. Obviously, you could use select distinct or distinct on, but that could be quite expensive. Instead, exists might be a better solution:

    SELECT c.*
    FROM campaign c 
    WHERE EXISTS (SELECT 1
                  FROM event e JOIN
                       dataset_event de
                       ON de.id_event = e.id_event
                  WHERE c.id_campaign = e.id_campaign AND
                        de.id_dataset = xxx
                 )
    LIMIT 100;
    

    For this, you want indexes on:

    • event(id_campaign, id_event)
    • dataset_event(id_event, id_dataset)