
Parser for signed overpunch values?


I am working with some old data imports and came across a bunch of data from an external source that reports financial numbers with a signed overpunch. I've seen a lot, but this is before my time. Before I write a function to parse these strange values, I wanted to check whether there is a standard way to handle them.

I guess my question is: does the .NET Framework provide a standard facility for converting signed overpunch strings? If not, are there any third-party tools I can use so I don't reinvent the wheel?


Solution

  • Over-punched numeric (Zoned-Decimal in Cobol) comes from the old punched cards, where the sign was over-punched on the last digit of a number. The format is commonly used in Cobol.

    Illustration showing overpunch on a punchcard from the IBM 29 Card Punch Reference Manual

    As there are both Ascii and Ebcdic Cobol compilers, there are both Ascii and Ebcdic versions of Zoned-Decimal. To make it even more complicated, the +0 and -0 characters differ between Ebcdic code pages: they are { and } in US-Ebcdic (IBM037) but ä and ü in German-Ebcdic (IBM273), and different again in other Ebcdic language versions.

    To process the data successfully, you need to know:

    • Did the data originate on an Ebcdic or an Ascii system?
    • If Ebcdic, which language version (US, German etc.)?

    If the data is still in its original character set, you can calculate the sign from the high (zone) nibble of the last digit.

    For EBCDIC the numeric hex codes are:

    Digit          0     1     2   ..    9      Characters (US-Ebcdic)

    Unsigned:   x'F0' x'F1' x'F2'  .. x'F9'     012 .. 9
    Negative:   x'D0' x'D1' x'D2'  .. x'D9'     }JK .. R
    Positive:   x'C0' x'C1' x'C2'  .. x'C9'     {AB .. I
    

    For US-Ebcdic Zoned this is the Java code to convert a string:

    // Offsets that map the overpunch letters back to the digits '1'..'9'.
    int positiveDiff = 'A' - '1';
    int negativeDiff = 'J' - '1';

    // ret holds the zoned field as a String, e.g. "0012C".
    String sign = "";
    char lastChar = ret.substring(ret.length() - 1).toUpperCase().charAt(0);

    switch (lastChar) {
        case '}':                       // negative zero
            sign = "-";
            lastChar = '0';
            break;
        case '{':                       // positive zero
            lastChar = '0';
            break;
        case 'A': case 'B': case 'C': case 'D': case 'E':
        case 'F': case 'G': case 'H': case 'I':
            lastChar = (char) (lastChar - positiveDiff);
            break;
        case 'J': case 'K': case 'L': case 'M': case 'N':
        case 'O': case 'P': case 'Q': case 'R':
            sign = "-";
            lastChar = (char) (lastChar - negativeDiff);
            break;
        default:                        // plain digit: unsigned, keep as is
            break;
    }
    ret = sign + ret.substring(0, ret.length() - 1) + lastChar;
    

    For German-Ebcdic, { and } become ä and ü; for other Ebcdic language versions you would need to look up the appropriate code page.
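
    Since only the two zero characters change between Ebcdic code pages (the letters A..I and J..R stay the same), one way to handle several languages is to pass those two characters in. The following is a minimal sketch, not a definitive implementation: the class name ZonedCodePage and its decode method are made up for illustration, and the German characters are the ones quoted above for IBM273.

```java
// Hypothetical helper: decodes a zoned-decimal string whose +0 / -0
// characters depend on the Ebcdic code page the data came from.
final class ZonedCodePage {
    private final char positiveZero; // character that x'C0' maps to
    private final char negativeZero; // character that x'D0' maps to

    ZonedCodePage(char positiveZero, char negativeZero) {
        this.positiveZero = positiveZero;
        this.negativeZero = negativeZero;
    }

    // US-Ebcdic (IBM037) uses { and }.
    static final ZonedCodePage US = new ZonedCodePage('{', '}');
    // German-Ebcdic (IBM273) uses a-umlaut and u-umlaut (assumption taken from the text).
    static final ZonedCodePage GERMAN = new ZonedCodePage('\u00E4', '\u00FC');

    /** Returns the digits with a leading "-" when the last character is a negative overpunch. */
    String decode(String field) {
        char last = field.charAt(field.length() - 1);
        String head = field.substring(0, field.length() - 1);
        if (last == positiveZero) {
            return head + '0';
        }
        if (last == negativeZero) {
            return "-" + head + '0';
        }
        if (last >= 'A' && last <= 'I') {           // positive 1..9
            return head + (char) (last - 'A' + '1');
        }
        if (last >= 'J' && last <= 'R') {           // negative 1..9
            return "-" + head + (char) (last - 'J' + '1');
        }
        return field;                                // plain digit: unsigned
    }
}
```

    For example, ZonedCodePage.US.decode("1230{") gives "12300" and ZonedCodePage.US.decode("0012L") gives "-00123". The sketch does no validation of the leading digits.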

    For Ascii Zoned this is the Java code:

    // Offsets that map the overpunch characters back to the digits '0'..'9'.
    int positiveFjDiff = '@' - '0';
    int negativeFjDiff = 'P' - '0';

    // ret holds the zoned field as a String.
    String sign = "";
    char lastChar = ret.substring(ret.length() - 1).toUpperCase().charAt(0);

    switch (lastChar) {
        case '@': case 'A': case 'B': case 'C': case 'D':
        case 'E': case 'F': case 'G': case 'H': case 'I':
            lastChar = (char) (lastChar - positiveFjDiff);
            break;
        case 'P': case 'Q': case 'R': case 'S': case 'T':
        case 'U': case 'V': case 'W': case 'X': case 'Y':
            sign = "-";
            lastChar = (char) (lastChar - negativeFjDiff);
            break;
        default:                        // plain digit: unsigned, keep as is
            break;
    }
    ret = sign + ret.substring(0, ret.length() - 1) + lastChar;
    

    Finally, if you are working with the raw Ebcdic bytes, you can calculate it like this (pseudocode):

    sign = '+'
    if ((last_digit & x'F0') == x'D0') {
       sign = '-'
    }
    last_digit = last_digit | x'F0'
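
    The same nibble test can be sketched in Java for a field that is still a raw Ebcdic byte array. This is only an illustration under stated assumptions: the class name EbcdicZoned is hypothetical, and the code trusts that every low nibble is a valid digit 0..9.

```java
// Hypothetical helper: decodes raw Ebcdic zoned-decimal bytes without
// converting them to characters first, so the code page does not matter.
final class EbcdicZoned {
    /** Returns the digits as a string, prefixed with "-" when the last zone nibble is x'D'. */
    static String decode(byte[] field) {
        StringBuilder out = new StringBuilder(field.length + 1);
        int last = field[field.length - 1] & 0xFF;   // treat the byte as unsigned
        if ((last & 0xF0) == 0xD0) {                 // zone D marks a negative value
            out.append('-');
        }
        for (byte b : field) {
            out.append((char) ('0' + (b & 0x0F)));   // low nibble holds the digit
        }
        return out.toString();
    }
}
```

    For the bytes x'F0' x'F0' x'F1' x'F2' x'C3' this returns "00123"; with x'D3' as the last byte it returns "-00123".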
    

    One last problem: decimal points are not stored in a Zoned-Decimal; they are assumed. You need to look at the Cobol Copybook to find out where the decimal point goes.

    e.g.

     If the Cobol Copybook is

        03 fld                 pic s99999.

     then 123 is stored as 0012C (Ebcdic source),

     but if the Copybook is (v stands for the assumed decimal point)

        03 fld                 pic s999v99.

     then 123 is stored as 1230{ (that is, 123.00)
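
    Once the sign has been resolved into a plain digit string, the assumed decimal point can be applied by treating the digits as an unscaled value. A minimal sketch, assuming the number of digits after the v has already been read from the Copybook by hand; the class name ImpliedDecimal and the method apply are hypothetical:

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper: applies the assumed decimal point from the Copybook.
final class ImpliedDecimal {
    /**
     * @param signedDigits digits with an optional leading "-", e.g. "12300"
     * @param scale        number of digits after the v in the picture clause
     */
    static BigDecimal apply(String signedDigits, int scale) {
        // BigInteger accepts leading zeros and an optional sign.
        return new BigDecimal(new BigInteger(signedDigits), scale);
    }
}
```

    With pic s999v99, the decoded digits "12300" and a scale of 2 give 123.00; with pic s99999, "00123" and a scale of 0 give 123. BigDecimal is used rather than double because the values are financial.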
    

    It would be best to do the translation in Cobol, or to use a Cobol translation package.

    There are several commercial packages for handling Cobol data, but they tend to be expensive. There are also some open source Java packages that can deal with mainframe Cobol data.