Multiplication and division using the DFSORT utility on the mainframe

There are two files, FILE1.DATA and FILE2.DATA. I want to calculate the percentage (number of records in FILE1 / number of records in FILE2) * 100 using DFSORT on the mainframe, and set a return code if it crosses a threshold (90%).

//********Extracting Unique records data*****************
//SORTT000 EXEC PGM=SORT   
//SYSOUT   DD  SYSOUT=*    
//SORTIN   DD  DSN=SAMPLE.DATA1,DISP=SHR 
//SORTOUT  DD  DSN=FILE1.DATA,                          
//             SPACE=(2790,(5376,1075),RLSE),                    
//             UNIT=TSTSF,                                       
//             DCB=(RECFM=FB,LRECL=05,BLKSIZE=0),                
//             DISP=(NEW,CATLG,DELETE)                           
//SYSIN    DD  *                                                 
 SORT FIELDS=(10,5,CH,A) 
 OUTREC FIELDS=(1:10,5)  
 SUM FIELDS=NONE         
/*                  
//************Getting count of records*****************     
//STEP001  EXEC PGM=ICETOOL                               
//TOOLMSG  DD SYSOUT=*         
//DFSMSG   DD SYSOUT=*                                    
//SYSOUT DD SYSOUT=*                                      
//SYSPRINT DD SYSOUT=*                                    
//IN1      DD DISP=SHR,DSN=FILE1.DATA
//IN2      DD DISP=SHR,DSN=FILE2.DATA
//OUT1     DD DSN=FILE1.DATA.COUNT,    
//            SPACE=(2790,(5376,1075),RLSE),      
//            UNIT=TSTSF,                         
//            DCB=(RECFM=FB,LRECL=06,BLKSIZE=0),  
//            DISP=(NEW,CATLG,DELETE)     
//OUT2     DD DSN=FILE2.DATA.COUNT,  
//            SPACE=(2790,(5376,1075),RLSE),      
//            UNIT=TSTSF,                         
//            DCB=(RECFM=FB,LRECL=06,BLKSIZE=0),  
//            DISP=(NEW,CATLG,DELETE)    
//TOOLIN   DD *                                           
   COUNT FROM(IN1)  WRITE(OUT1) DIGITS(6)                   
   COUNT FROM(IN2)  WRITE(OUT2) DIGITS(6)                    
/*      
//*******Calculating percentage and if above 90% setting RC 04*****                                                  
//STEP002  EXEC PGM=SORT                                             
//SYSOUT   DD SYSOUT=*                                               
//SORTIN   DD DSN=FILE2.DATA.COUNT,DISP=SHR                
//         DD DSN=FILE1.DATA.COUNT,DISP=SHR                
//SORTOUT  DD DSN=FILE.DATA.COUNT.OUT,     
//            SPACE=(2790,(5376,1075),RLSE),      
//            UNIT=TSTSF,                         
//            DCB=(RECFM=FB,LRECL=80,BLKSIZE=0),  
//            DISP=(NEW,CATLG,DELETE)                    
//SETRC    DD SYSOUT=*                                               
//SYSIN    DD *                                                      
  INREC IFTHEN=(WHEN=INIT,BUILD=(1,6,X,6X'00',SEQNUM,1,ZD,80:X)),    
  IFTHEN=(WHEN=(14,1,ZD,EQ,2),OVERLAY=(8:1,6))                       
  SORT FIELDS=(7,1,CH,A),EQUALS                                      
  SUM FIELDS=(8,4,BI,12,2,BI)                                        
  OUTREC OVERLAY=(15:X,1,6,ZD,DIV,+2,M11,LENGTH=6,X,                 
                 (8,6,ZD,MUL,+100),DIV,1,6,ZD,MUL,+100,EDIT=(TTT.TT))

  OUTFIL FNAMES=SETRC,NULLOFL=RC4,INCLUDE=(23,6,CH,GT,C'090.00')          
  OUTFIL BUILD=(05:C'TOTAL NUMBER RECORDS IN FILE2        : ',1,6,/, 
                05:C'TOTAL NUMBER RECORDS IN FILE1        : ',8,6,/, 
                05:C'PERCENTAGE                           : ',23,6,/,
               80:X)                                                
//*

The problem I am facing is that the data sets FILE1.DATA.COUNT and FILE2.DATA.COUNT are being created with a record length of 15 despite my specifying LRECL=6. (Note: this was the question that existed when the first answer was written, and it no longer relates to the code above.)
Can we merge both steps into one?
What does this, (15:X,1,6,ZD,DIV,+2,M11,LENGTH=6,X, (8,6,ZD,MUL,+100),DIV,1,6,ZD,MUL,+100,EDIT=(TTT.TT)), mean specifically?

Solution

The answer to your first question is simply that you did not tell ICETOOL's COUNT operator how long you wanted the output data to be, so it came up with its own figure.

This is from the DFSORT Application Programming Guide:

WRITE(countdd) Specifies the ddname of the count data set to be produced by ICETOOL for this operation. A countdd DD statement must be present. ICETOOL sets the attributes of the count data set as follows:

• RECFM is set to FB.

• LRECL is set to one of the following:

– If WIDTH(n) is specified, LRECL is set to n. Use WIDTH(n) if your count record length and LRECL must be set to a particular value (for example, 80), or if you want to ensure that the count record length does not exceed a specific maximum (for example, 20 bytes).

– If WIDTH(n) is not specified, LRECL is set to the calculated required record length. If your LRECL does not need to be set to a particular value, you can let ICETOOL determine and set the appropriate LRECL value by not specifying WIDTH(n).

And:

DIGITS(d)

Specifies d digits for the count in the output record, overriding the default of 15 digits. d can be 1 to 15. The count is written as d decimal digits with leading zeros. DIGITS can only be specified if WRITE(countdd) is specified.

If you know that your count requires less than 15 digits, you can use a lower number of digits (d) instead by specifying DIGITS(d). For example, if DIGITS(10) is specified, 10 digits are used instead of 15.

If you use DIGITS(d) and the count overflows the number of digits used, ICETOOL terminates the operation. You can prevent the overflow by specifying an appropriately higher d value for DIGITS(d). For example, if DIGITS(5) results in overflow, you can use DIGITS(6) instead.

And:

WIDTH(n)

Specifies the record length and LRECL you want ICETOOL to use for the count data set. n can be from 1 to 32760. WIDTH can only be specified if WRITE(countdd) is specified. ICETOOL always calculates the record length required to write the count record and uses it as follows:

• If WIDTH(n) is specified and the calculated record length is less than or equal to n, ICETOOL sets the record length and LRECL to n. ICETOOL pads the count record on the right with blanks to the record length.

• If WIDTH(n) is specified and the calculated record length is greater than n, ICETOOL issues an error message and terminates the operation.

• If WIDTH(n) is not specified, ICETOOL sets the record length and LRECL to the calculated record length.

Use WIDTH(n) if your count record length and LRECL must be set to a particular value (for example, 80), or if you want to ensure that the count record length does not exceed a specific maximum (for example, 20 bytes). Otherwise, you can let ICETOOL calculate and set the appropriate record length and LRECL by not specifying WIDTH(n).
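As a minimal sketch of how the quoted operands fit together (not a definitive job): the step below asks for a six-digit count padded to an 80-byte record. The step name COUNTS, the temporary data set name &&COUNT1 and UNIT=SYSDA are assumptions for illustration; the job card is omitted because it is site-specific. No DCB is coded on OUT1, so ICETOOL sets RECFM and LRECL itself as described above.

```jcl
//* Hypothetical step; names and UNIT are assumptions, adjust for your site
//COUNTS   EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//IN1      DD DISP=SHR,DSN=FILE1.DATA
//OUT1     DD DSN=&&COUNT1,DISP=(NEW,PASS),
//            UNIT=SYSDA,SPACE=(TRK,(1,1))
//TOOLIN   DD *
* Six count digits with leading zeros, padded with blanks to 80 bytes
  COUNT FROM(IN1) WRITE(OUT1) DIGITS(6) WIDTH(80)
/*
```

Leaving out WIDTH(80) would let ICETOOL use the calculated record length, which with DIGITS(6) alone is six bytes.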

For your second question: yes, it can be done in one step, and greatly simplified.

The thing is, it can be simplified further by doing something else. Exactly what depends on your actual task, which we don't know; we only know the solution you have chosen for it.

For instance, you want to know when one file is within 10% of the size of the other. One way, if on-the-dot accuracy is not required, is to talk to the technical staff who manage your storage. Tell them what you want to do, and they probably already have something you can use to do it with (when discussing this, bear in mind that these are technically data sets, not files).

Alternatively, something has already read or written those files. If the last program to do so does not already produce counts of what it has read/written (to my mind standard good practice, with the program reconciling as well), then amend the programs to do so now. There. Magic. You have your counts.

Arrange for those counts to be in a data set of their own (preferably with record-types, headers/trailers, more standard good practice).

The first step takes the larger (the expectation) of the two counts, "works out" what 90% of it would be (this needs nothing but a simple subtraction, with the right data) and generates a SYMNAMES-format file (fixed-length 80-byte records) containing a SORT symbol for a constant with that value.

The second step uses INCLUDE/OMIT to compare the second record count against that symbol, and sets the return code through NULLOUT or NULLOFL.
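A rough sketch of what that second step could look like, under stated assumptions: the symbol name THRESHOLD_90, its value, the step name and the in-stream sample data are all made up for illustration, and in practice the SYMNAMES record would come from the data set generated by the first step rather than being coded in-stream. The count being tested is assumed to be six zoned-decimal digits in positions 1-6.

```jcl
//* Hypothetical step; symbol name, value and sample count are made up
//CHECK    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* Normally the output of the first step; shown in-stream for clarity
//SYMNAMES DD *
THRESHOLD_90,+180
/*
//* The second record count, six digits in positions 1-6
//SORTIN   DD *
000170
/*
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
* RC=4 is set when no record passes the INCLUDE (SORTOUT is empty)
  OPTION COPY,NULLOUT=RC4
  INCLUDE COND=(1,6,ZD,GT,THRESHOLD_90)
/*
```

As written, the step ends with RC=4 when the count is not above the threshold. Using OMIT with the same condition would reverse that, giving RC=4 when the count is above it. Either way only a single small record is read, which is the point of the approach.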

The advantage of the above types of solution is that they basically use very few resources. On the Mainframe, the client pays for resources. Your client may not be so happy at the end of the year to find that they've paid for reading and "counting" 7.3m records just so that you can set an RC.

OK, perhaps 7.3m is not so large, but, when you have your "solution", the next person along is going to do it with 100,000 records, the next with 1,000,000 records. All to set an RC. Any one run of which (even with the 10,000-record example) will outweigh the costs of a "Mainframe" solution running every day for the next 15+ years.

For your third question:

 OUTREC OVERLAY=(15:X,1,6,ZD,DIV,+2,M11,LENGTH=6,X,
                (8,6,ZD,MUL,+100),DIV,1,6,ZD,MUL,+100,EDIT=(TTT.TT))

OUTREC is processed after SORT/MERGE and SUM (if present), otherwise after INREC. Note that the physical order in which these are specified in the JCL does not affect the order in which they are processed.

OVERLAY says "update the information in the current record with these data-manipulations" (BUILD, by contrast, always creates a new copy of the current record).

15: is "column 15" (position 15) on the record.

X inserts a blank.

1,6,ZD means "the information, at this moment, at start-position one for a length of six, which is a zoned-decimal format".

DIV is divide.

+2 is a numeric constant.

1,6,ZD,DIV,+2 means "take the six-digit number starting at position one, and divide it by two, giving a 'result'", which will be placed at the next available position (16 in your case).

M11 is a built-in edit mask, used here to format the result. For details of what that mask is, look it up in the manual, as you will discover other useful pre-defined masks at the same time.

LENGTH=6 limits the result to six digits.

So far: the number in the first six positions is divided by two, and the result is formatted (by the mask) as six unsigned digits with leading zeros, written starting at position 16.

The remaining elements of the statement are similar. Brackets affect the "precedence" of numeric operators in a normal way (consult the manual to be familiar with the precedence rules).

EDIT=(TTT.TT) is a user-defined edit mask, in this case inserting a decimal point, truncating the otherwise existing left-most digit, and showing significant leading zeros when necessary.
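To see the whole statement at work, here is a small demo step that can be tried as a sketch. The step name DEMO and the sample counts (200 and 170) are made up, and the job card is omitted. Assuming DFSORT's truncating integer division and left-to-right evaluation of MUL and DIV (check the precedence rules in the manual), positions 16-21 should come out as 000100 and positions 23-28 as 085.00.

```jcl
//* Hypothetical demo step; sample counts are made up for illustration
//DEMO     EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* Sample record: 000200 in positions 1-6, 000170 in positions 8-13
//SORTIN   DD *
000200 000170
/*
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
* Positions 16-21: (1,6) divided by 2, six digits with leading zeros
* Positions 23-28: ((8,6) * 100) / (1,6) * 100, edited as TTT.TT
  OPTION COPY
  OUTREC OVERLAY=(15:X,1,6,ZD,DIV,+2,M11,LENGTH=6,X,
                 (8,6,ZD,MUL,+100),DIV,1,6,ZD,MUL,+100,EDIT=(TTT.TT))
/*
```

Note that because the division happens before the final multiplication by 100, the two digits after the decimal point are always zero under these assumptions; the percentage is effectively a whole number.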