Search code examples
hana

Multiple left joins with aggregation on same table causes huge performance hit in SAP HANA


I am joining two tables on HANA and, to get some statistics, I am LEFT joining the items table 3 times to get a total count, number of entries processed and number of errors, as shown below.

This is a dev system and the items table has only 1500 items. But the query below runs for 17 seconds.

When I remove any of the three aggregation terms (but leave the corresponding JOIN in place), the query executes almost immediately.

I have also tried adding indexes on the fields used in the specific JOINs, but that makes no difference.

select rk.guid, rk.run_id, rk.status, rk.created_at, rk.created_by, 
count( distinct rp.guid ), 
count( distinct rp2.guid ), 
count( distinct rp3.guid )
    from zbsbpi_rk as rk
    left join zbsbpi_rp as rp
      on rp.header = rk.guid
    left join zbsbpi_rp as rp2
      on rp2.header = rk.guid
     and rp2.processed = 'X'
    left join zbsbpi_rp as rp3
      on rp3.header = rk.guid
     and rp3.result_status = 'E'
    where rk.run_id = '0000000010'
    group by rk.guid, run_id, status, created_at, created_by

Solution

  • My apologies, but I forgot that I had posted this question here. I had posted the same question at answers.sap.com after not getting any joy here: https://answers.sap.com/questions/172096/multiple-left-joins-with-aggregation-on-same-table.html

    I eventually came up with the solution, which was a bit of a "doh!" moment:

      select rk.guid, rk.run_id, rk.status, rk.created_at, rk.created_by,
        count( distinct rp.guid ), 
        count( distinct rp2.guid ), 
        count( distinct rp3.guid )
        from zbsbpi_rk as rk
        join zbsbpi_rp as rp
          on rp.header = rk.guid
        left join zbsbpi_rp as rp2
          on rp2.guid = rp.guid
         and rp2.processed = 'X'
        left join zbsbpi_rp as rp3
          on rp3.guid = rp.guid
         and rp3.result_status = 'E'
        where rk.run_id = '0000000010'
        group by rk.guid, run_id, status, created_at, created_by
    

    The subsequent left joins needed only to be joined to the first join on the same table, as the first join contained a superset of all the records anyway.