Search code examples
loopsgroup-byabapcollect

ABAP - Group employees by cost center and calculate sum


I have an internal table with employees. Each employee is assigned to a cost center. In another column is the salary. I want to group the employees by cost center and get the total salary per cost center. How can I do it?

At first I have grouped them as follows:

Loop at itab assigning field-symbol(<c>) 
group by <c>-kostl ascending.

Write: / <c>-kostl.

This gives me a list of all cost-centers. In the next step I would like to calculate the sum of the salaries per cost center (the sum for all employees with the same cost-center).

How can I do it? Can I use collect?

Update: I have tried with the follwing coding. But I get the error "The syntax for a method specification is "objref->method" or "class=>method"". lv_sum_salary = sum( <p>-salary ).

loop at i_download ASSIGNING FIELD-SYMBOL(<c>)
    GROUP BY <c>-kostl ascending.
    Write: / <c>-kostl, <c>-salary.

    data: lv_sum_salary type p DECIMALS 2.
    Loop at group <c> ASSIGNING FIELD-SYMBOL(<p>).
    lv_sum_salary = sum( <p>-salary ).
     Write: /'  ',<p>-pernr,  <p>-salary.
    endloop.
     Write: /'  ', lv_sum_salary.
  endloop.

Solution

  • Generally speaking, to group and sum, there are these 4 possibilities (code snippets provided below):

    1. SQL with an internal table as source: SELECT ... SUM( ... ) ... FROM @itab ... GROUP BY ... (since ABAP 7.52, HANA database only); NB: beware the possible performance overhead.
    2. The classic way, everything coded:
      • Sort by cost center
      • Loop at the lines
        • At each line, add the salary to the total
        • If the cost center is different in the next line, process the total
    3. LOOP AT with GROUP BY, and LOOP AT GROUP
    4. VALUE with FOR GROUPS and GROUP BY, and REDUCE and FOR ... IN GROUP for the sum

    Note that only the option with the explicit sorting will sort by cost center, the other ones won't provide a result sorted by cost center.

    All the below examples have in common these declarative and initialization parts:

    TYPES: BEGIN OF ty_itab_line,
             kostl  TYPE c LENGTH 10,
             pernr  TYPE c LENGTH 10,
             salary TYPE p LENGTH 8 DECIMALS 2,
           END OF ty_itab_line,
           tt_itab TYPE STANDARD TABLE OF ty_itab_line WITH EMPTY KEY,
           BEGIN OF ty_total_salaries_by_kostl,
             kostl          TYPE c LENGTH 10,
             total_salaries TYPE p LENGTH 10 DECIMALS 2,
           END OF ty_total_salaries_by_kostl,
           tt_total_salaries_by_kostl TYPE STANDARD TABLE OF ty_total_salaries_by_kostl WITH EMPTY KEY.
    DATA(itab) = VALUE tt_itab( ( kostl = 'CC1' pernr = 'E1' salary = '4000.00' )
                                ( kostl = 'CC1' pernr = 'E2' salary = '3100.00' )
                                ( kostl = 'CC2' pernr = 'E3' salary = '2500.00' ) ).
    DATA(total_salaries_by_kostl) = VALUE tt_total_salaries_by_kostl( ).
    

    and the expected result will be:

    ASSERT total_salaries_by_kostl = VALUE tt_total_salaries_by_kostl(
        ( kostl = 'CC1' total_salaries = '7100.00' )
        ( kostl = 'CC2' total_salaries = '2500.00' ) ).
    

    Examples for each possibility:

    1. SQL on internal table:
      SELECT kostl, SUM( salary ) AS total_salaries
        FROM @itab AS itab ##DB_FEATURE_MODE[ITABS_IN_FROM_CLAUSE]
        GROUP BY kostl
        INTO TABLE @total_salaries_by_kostl.
      
    2. Classic way:
      SORT itab BY kostl.
      DATA(next_line) = VALUE ty_ref_itab_line( ).
      DATA(total_line) = VALUE ty_total_salaries_by_kostl( ).
      LOOP AT itab REFERENCE INTO DATA(line).
        DATA(next_kostl) = VALUE #( itab[ sy-tabix + 1 ]-kostl OPTIONAL ).
        total_line-total_salaries = total_line-total_salaries + line->salary.
        IF next_kostl <> line->kostl.
          total_line-kostl = line->kostl.
          APPEND total_line TO total_salaries_by_kostl.
          CLEAR total_line.
        ENDIF.
      ENDLOOP.
      
      EDIT: I don't talk about AT NEW and AT END OF because I'm not fan of them, as they don't explicitly define the possible multiple fields, they implicitly consider all the fields before the mentioned field + this field included. I also ignore ON CHANGE OF, this one being obsolete.
    3. LOOP AT with GROUP BY:
      LOOP AT itab REFERENCE INTO DATA(line)
          GROUP BY ( kostl = line->kostl )
          REFERENCE INTO DATA(kostl_group).
        DATA(total_line) = VALUE ty_total_salaries_by_kostl(
            kostl = kostl_group->kostl ).
        LOOP AT GROUP kostl_group REFERENCE INTO line.
          total_line-total_salaries = total_line-total_salaries + line->salary.
        ENDLOOP.
        APPEND total_line TO total_salaries_by_kostl.
      ENDLOOP.
      
    4. VALUE with FOR and GROUP BY, and REDUCE for the sum:
      total_salaries_by_kostl = VALUE #(
          FOR GROUPS <kostl_group> OF <line> IN itab
          GROUP BY ( kostl = <line>-kostl )
          ( kostl          = <kostl_group>-kostl
            total_salaries = REDUCE #( INIT sum = 0
                                       FOR <line_2> IN GROUP <kostl_group>
                                       NEXT sum = sum + <line_2>-salary ) ) ).