Search code examples
kettlepentaho-spoonpentaho-data-integration

Pentaho PDI get SQL SUM() with conditions


I'm using Pentaho PDI 7.1. I'm trying to convert data from Mysql to Mysql changing the structure of data.

I'm reading the source table (customers) and for each row I've to run another query to calculate the balance. I was trying to use Database value lookup to accomplish it but maybe is not the best way.

I've to run a query like this to get the balance:

SELECT 

 SUM(
        CASE WHEN direzione='ENTRATA' THEN -importo ELSE +importo  END
 )
   FROM Movimento WHERE contoFidelizzato_id = ?

I should set the parameter taking it from the previous step. Some advice?


Solution

  • The Database lookup value may be a good idea, especially if you are used to database reasoning, but it may result in many queries which may not be the most efficient.

    A more PDI-ish style would be to make the query like:

    SELECT contoFidelizzato_id
         , SUM(CASE WHEN direzione='ENTRATA' THEN -importo ELSE +importo END)
    FROM Movimento
    GROUP BY contoFidelizzato_id
    

    and use it as the info source of a Lookup Stream Step, like this:

    enter image description here

    An even more PDI-ish style would be to divert the source table (customer) in two flows : one in which you keep the source rows, and one that you group by contoFidelizzato_id. Of course, you need a formula, or a Javascript, or to put a formula in the SQL of the Table input to change the sign when needed.

    enter image description here

    Test to know which strategy is better in your case. You'll soon discover that the PDI is very good at handling large data.