mainframe, jcl, dfsort

Counting records in a group using DFSORT


I have a file that contains JES2 job output. I'm practicing with DFSORT to retrieve some information from the file on a per-job basis.

One thing I want to achieve is the following:

  • first line of job output
  • last line of job output
  • total lines of job output (per job)

I managed to get the total lines (of the complete output), job number and total lines of the job in my result:

(1)    (2)    (3)
000001 000001 000001         (1) Total lines
000002 000001 000002         (2) Job number
000003 000001 000003         (3) Lines in job
000004 000001 000004
000005 000001 000005
000006 000001 000006
000007 000001 000007
000008 000001 000008
...
000953 000001 000953
000954 000001 000954
000955 000001 000955
000956 000001 000956
000957 000001 000957
000958 000001 000958
000959 000001 000959
000960 000002 000001   <-- new job output starts here
000961 000002 000002
000962 000002 000003
000963 000002 000004
000964 000002 000005
000965 000002 000006

To achieve the above output, I used the following SYSIN DD statements (PGM=SORT):

INREC OVERLAY=(135:SEQNUM,6,ZD,START=1,INCR=1)           <--- Column (1)

INREC  IFTHEN=(WHEN=GROUP,                         
                    BEGIN=(21,6,CH,EQ,C'J E S 2 '),
                    PUSH=(142:ID=6),                     <--- Column (2)
                    HIT=NEXT),                     
       IFTHEN=(WHEN=GROUP,                         
                    BEGIN=(21,6,CH,EQ,C'J E S 2 '),
                    PUSH=(149:SEQ=6))                    <--- Column (3)

The result I want to achieve looks like this:

JOBNAME  f_row  l_row  rowcnt
JOB12345 000001 000100 000099
JOB54321 000101 000500 000399

The issue is that I don't know how to get the rowcnt value onto the records of the job it belongs to. I can calculate it, but only on the first record of the next job (via SUB). I think the best approach would be to use an IFTHEN clause and push rowcnt to all records in the group, but I haven't got that working in the last two days.

At this point I'm stuck, and I'm no longer sure which statements I should use to accomplish this. Some forums suggest an approach using two separate files, but that is not preferred in my situation. Any guidance in the right direction would be appreciated.


Solution

  • I was able to resolve this myself by taking the following steps.

    First I had to add the sequence numbers, which counted the total records, the unique jobs and the number of records within a unique job. I was able to do this before, but I consolidated the 2 steps I had before into one:

    INREC OVERLAY=(135:SEQNUM,6,ZD,START=1,INCR=1)
    OUTREC IFTHEN=(WHEN=GROUP,                          
                   BEGIN=(20,7,CH,EQ,C'J E S 2'),       
                   PUSH=(142:ID=6),                     
                   HIT=NEXT),                           
           IFTHEN=(WHEN=GROUP,                          
                   BEGIN=(20,7,CH,EQ,C'J E S 2'),       
                   PUSH=(149:SEQ=6))                    
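
    Because the header test is identical in both clauses, a single WHEN=GROUP clause should be able to push both values, since PUSH accepts a list of items. A minimal sketch, assuming fixed-length input records and the same column positions as above; OVERLAY is used so that the original record in columns 1-134 is kept:

    ```
    * Assumption: same header test and output columns as in the step above.
    * One group clause pushes the job ID (142) and the in-job sequence (149).
      INREC OVERLAY=(135:SEQNUM,6,ZD,START=1,INCR=1)
      OUTREC IFTHEN=(WHEN=GROUP,
                     BEGIN=(20,7,CH,EQ,C'J E S 2'),
                     PUSH=(142:ID=6,149:SEQ=6))
    ```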
    

    This produces the following output:

     000952 000001 000952
     000953 000001 000953
     000954 000001 000954
     000955 000001 000955
     000956 000001 000956
     000957 000001 000957
     000958 000001 000958
     000959 000001 000959
     000960 000002 000001    <-- New job starts here
     000961 000002 000002
     000962 000002 000003
     000963 000002 000004
     000964 000002 000005
     000965 000002 000006
     000966 000002 000007
     000967 000002 000008
     000968 000002 000009
     000969 000002 000010
     000970 000002 000011
     000971 000002 000012
     000972 000002 000013
    

    My next step was to bring the highest record count of each job to the top. I accomplished this by sorting in descending order on the second sort key, see below. I also added a new column (at position 157) that holds the highest record count, so that I could use it later.

    SORT FIELDS=(142,6,ZD,A,135,6,ZD,D)        
    OUTREC IFTHEN=(WHEN=GROUP,                 
                   END=(20,7,CH,EQ,C'J E S 2'),
                   PUSH=(157:149,6))           
    

    The above step produced the following output (sorted on the second column ascending, then on the first column descending):

    000959 000001 000959  000959
    000958 000001 000958  000959
    000957 000001 000957  000959
    000956 000001 000956  000959
    000955 000001 000955  000959
    ...
    000007 000001 000007  000959
    000006 000001 000006  000959
    000005 000001 000005  000959
    000004 000001 000004  000959
    000003 000001 000003  000959
    000002 000001 000002  000959
    000001 000001 000001  000959
    001976 000002 001017  001017    <--- Next job starts, highest row count is 1017
    001975 000002 001016  001017
    001974 000002 001015  001017
    001973 000002 001014  001017
    001972 000002 001013  001017
    001971 000002 001012  001017
    001970 000002 001011  001017
    001969 000002 001010  001017
    001968 000002 001009  001017
    001967 000002 001008  001017
    001966 000002 001007  001017
    

    In the fourth column I now have the total number of rows in a job. All that was left was to restore the original order of the records. This was done by sorting on the same columns, but using ascending instead of descending order for the second sort key.

    SORT FIELDS=(142,6,ZD,A,135,6,ZD,A)
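
    With the fourth column in place, the summary layout asked for in the question can be taken from the first record of each job. A sketch, assuming it is added to this final sort step and that the job ID in columns 142-147 stands in for the job name (the position of the job name in the JES2 header is not given here):

    ```
    * Keep only the first record of each job (in-job sequence = 1).
    * f_row  = overall sequence number of that record (135,6)
    * l_row  = f_row + rowcnt - 1
    * rowcnt = job total pushed to every record (157,6)
      OUTFIL INCLUDE=(149,6,ZD,EQ,1),
             BUILD=(142,6,X,
                    135,6,X,
                    135,6,ZD,ADD,157,6,ZD,SUB,+1,EDIT=(TTTTTT),X,
                    157,6)
    ```

    For the two jobs in the sample output above, this should give 000001 000001 000959 000959 and 000002 000960 001976 001017.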
    

    Looking back at this solution, I was overthinking the problem, which is why I wasn't able to figure it out :)
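
    A possible single-pass alternative, which is not the method used above: OUTFIL SECTIONS with a TRAILER3 can report the record count and the lowest and highest sequence number per group, which would avoid the two extra sorts. A sketch, assuming fixed-length records and the same header test, with the group ID again standing in for the job name:

    ```
    * Assumption: fixed-length input, header text in columns 20-26.
    * 135 = overall sequence number, 142 = job ID pushed to every record.
      OPTION COPY
      INREC IFTHEN=(WHEN=INIT,
                    OVERLAY=(135:SEQNUM,6,ZD)),
            IFTHEN=(WHEN=GROUP,
                    BEGIN=(20,7,CH,EQ,C'J E S 2'),
                    PUSH=(142:ID=6))
    * One trailer line per job ID: ID, first row, last row, row count.
    * NODETAIL suppresses the data records themselves.
      OUTFIL REMOVECC,NODETAIL,
             SECTIONS=(142,6,
               TRAILER3=(142,6,X,
                         MIN=(135,6,ZD,M11,LENGTH=6),X,
                         MAX=(135,6,ZD,M11,LENGTH=6),X,
                         COUNT=(M11,LENGTH=6)))
    ```

    Any records that precede the first header line would have a blank ID and end up in a section of their own.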