I am working with some old data imports and came across a bunch of data from an external source that reports financial numbers with a signed overpunch. I've seen a lot, but this is before my time. Before I go about creating a function to parse these strangers, I wanted to check whether there is a standard way to handle them.
I guess my question is: does the .NET Framework provide a standard facility for converting signed overpunch strings? If not .NET, are there any third-party tools I can use so I don't reinvent the wheel?
Over-punched numeric (Zoned Decimal in Cobol) comes from the old punched cards, where the sign was over-punched on the last digit of a number. The format is still commonly used in Cobol.
As there are both ASCII and EBCDIC Cobol compilers, there are both ASCII and EBCDIC versions of Zoned Decimal. To make it even more complicated, the characters for -0 and +0 depend on the EBCDIC code page: they are } and { in US EBCDIC (IBM037), but ü and ä in German EBCDIC (IBM273), and different again in other EBCDIC language versions.
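To process the data successfully, you need to know whether it originated on an ASCII or an EBCDIC system and, for EBCDIC, which language version (code page) was used.
If the data is still in the original character set, you can calculate the sign directly from the hex code of the last character.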
For EBCDIC the numeric hex codes are:
    Digit      0      1      2      ..  9      Characters
    Unsigned:  x'F0'  x'F1'  x'F2'  ..  x'F9'  0 1 2 .. 9
    Negative:  x'D0'  x'D1'  x'D2'  ..  x'D9'  } J K .. R
    Positive:  x'C0'  x'C1'  x'C2'  ..  x'C9'  { A B .. I
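For US EBCDIC Zoned, this is the Java code to convert a string (after the data has been translated to ASCII/Unicode):
    // Converts e.g. "0012C" to "00123" and "1230}" to "-12300".
    static String fromEbcdicZoned(String ret) {
        if (ret == null || ret.isEmpty()) {
            return ret;
        }
        int positiveDiff = 'A' - '1';
        int negativeDiff = 'J' - '1';
        String sign = "";
        char lastChar = Character.toUpperCase(ret.charAt(ret.length() - 1));
        switch (lastChar) {
        case '}':
            sign = "-"; // negative zero: fall through to set the digit
        case '{':
            lastChar = '0';
            break;
        case 'A': case 'B': case 'C': case 'D': case 'E':
        case 'F': case 'G': case 'H': case 'I':
            lastChar = (char) (lastChar - positiveDiff);
            break;
        case 'J': case 'K': case 'L': case 'M': case 'N':
        case 'O': case 'P': case 'Q': case 'R':
            sign = "-";
            lastChar = (char) (lastChar - negativeDiff);
            break;
        default:
            // plain digit: the value is unsigned, leave it unchanged
        }
        return sign + ret.substring(0, ret.length() - 1) + lastChar;
    }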
For German EBCDIC, { and } become ä and ü; for other EBCDIC language versions you would need to look up the appropriate code page.
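One way to cope with the code-page differences is to pass the two zero characters in as parameters. The following is only a sketch: the class `ZonedSignChars` and its `decode` method are hypothetical names made up for illustration, and it assumes the letters A-I and J-R sit at the same code points in every EBCDIC code page you deal with, so that only the +0 and -0 characters vary.

```java
final class ZonedSignChars {
    /**
     * Decodes a zoned-decimal string that has already been converted to Unicode.
     * posZero and negZero are the characters that x'C0' and x'D0' map to in the
     * source code page, e.g. '{' and '}' for IBM037.
     */
    static String decode(String value, char posZero, char negZero) {
        if (value.isEmpty()) {
            return value;
        }
        String body = value.substring(0, value.length() - 1);
        char last = value.charAt(value.length() - 1);
        if (last == posZero) {
            return body + '0';
        }
        if (last == negZero) {
            return "-" + body + '0';
        }
        if (last >= 'A' && last <= 'I') {          // positive 1..9
            return body + (char) (last - 'A' + '1');
        }
        if (last >= 'J' && last <= 'R') {          // negative 1..9
            return "-" + body + (char) (last - 'J' + '1');
        }
        return value;                              // plain digit: unsigned
    }

    public static void main(String[] args) {
        // US EBCDIC (IBM037) zero characters
        System.out.println(decode("1230}", '{', '}'));
        // German EBCDIC (IBM273) zero characters: ä = \u00E4, ü = \u00FC
        System.out.println(decode("1230\u00FC", '\u00E4', '\u00FC'));
    }
}
```

Unlike the code above, this sketch does not upper-case the last character, because upper-casing ä or ü would change them to Ä or Ü and the comparison would fail.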
For ASCII Zoned, this is the Java code:
    // Converts e.g. "0012C" to "00123" and "0012S" to "-00123".
    static String fromAsciiZoned(String ret) {
        if (ret == null || ret.isEmpty()) {
            return ret;
        }
        int positiveFjDiff = '@' - '0';
        int negativeFjDiff = 'P' - '0';
        String sign = "";
        char lastChar = Character.toUpperCase(ret.charAt(ret.length() - 1));
        switch (lastChar) {
        case '@': case 'A': case 'B': case 'C': case 'D':
        case 'E': case 'F': case 'G': case 'H': case 'I':
            lastChar = (char) (lastChar - positiveFjDiff);
            break;
        case 'P': case 'Q': case 'R': case 'S': case 'T':
        case 'U': case 'V': case 'W': case 'X': case 'Y':
            sign = "-";
            lastChar = (char) (lastChar - negativeFjDiff);
            break;
        default:
            // plain digit: the value is unsigned, leave it unchanged
        }
        return sign + ret.substring(0, ret.length() - 1) + lastChar;
    }
Finally, if you are working in EBCDIC, you can calculate the sign from the high nibble of the last byte, like this (pseudocode):
    sign = '+'
    if ((last_digit & x'F0') == x'D0') {
        sign = '-'
    }
    last_digit = last_digit | x'F0'
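The same nibble test can be sketched in Java on the raw, untranslated bytes, which avoids the code-page problem altogether. The class `EbcdicZonedBytes` and its `decode` method are hypothetical names for illustration, and the sketch assumes the field really is EBCDIC zoned decimal: zone x'D' on the last byte means negative, anything else is treated as positive.

```java
final class EbcdicZonedBytes {
    /**
     * Decodes raw EBCDIC zoned-decimal bytes into a signed digit string.
     * The sign is taken from the high nibble of the last byte (x'D' = negative);
     * the high nibble of the other bytes is ignored, not validated.
     */
    static String decode(byte[] field) {
        if (field.length == 0) {
            return "";
        }
        StringBuilder digits = new StringBuilder(field.length + 1);
        int last = field[field.length - 1] & 0xFF;
        if ((last & 0xF0) == 0xD0) {
            digits.append('-');
        }
        for (byte b : field) {
            int digit = b & 0x0F;                  // low nibble holds the digit
            if (digit > 9) {
                throw new IllegalArgumentException("not a zoned-decimal byte: " + (b & 0xFF));
            }
            digits.append((char) ('0' + digit));
        }
        return digits.toString();
    }

    public static void main(String[] args) {
        byte[] positive = {(byte) 0xF0, (byte) 0xF0, (byte) 0xF1, (byte) 0xF2, (byte) 0xC3};
        byte[] negative = {(byte) 0xF1, (byte) 0xF2, (byte) 0xF3, (byte) 0xF0, (byte) 0xD0};
        System.out.println(decode(positive)); // 00123
        System.out.println(decode(negative)); // -12300
    }
}
```

This only works if the file was transferred in binary mode; once the bytes have gone through an EBCDIC-to-ASCII translation, the character-based conversion is needed instead.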
One last problem: decimal points are not stored in a zoned decimal; they are assumed. You need to look at the Cobol copybook to find out where the decimal point belongs.
For example, if the Cobol copybook is
    03 fld pic s99999.
then 123 is stored as 0012C (EBCDIC source), but if the copybook is (v stands for the assumed decimal point)
    03 fld pic s999v99.
then 123 is stored as 1230{
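Once the sign has been resolved, the assumed decimal point can be applied with `BigDecimal`. This is a minimal sketch: the class `ImpliedDecimal` and its `apply` method are hypothetical names, and the `scale` argument (the number of digits after the v in the picture clause) is something you have to read off the copybook yourself.

```java
import java.math.BigDecimal;

final class ImpliedDecimal {
    /**
     * Applies an assumed decimal point to a signed digit string.
     * signedDigits is the already-converted value, e.g. "12300" or "-12300";
     * scale is the number of digits after the v in the picture clause.
     */
    static BigDecimal apply(String signedDigits, int scale) {
        return new BigDecimal(signedDigits).movePointLeft(scale);
    }

    public static void main(String[] args) {
        // pic s999v99: "1230{" converts to "12300", with two assumed decimals
        System.out.println(apply("12300", 2).toPlainString());
        // pic s99999: "0012C" converts to "00123", with no decimals
        System.out.println(apply("00123", 0).toPlainString());
    }
}
```

For financial numbers, `BigDecimal` keeps the value exact, which a `double` would not guarantee.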
It would be best to do the translation in Cobol, or to use a Cobol translation package.
There are several commercial packages for handling Cobol data; they tend to be expensive. There are also some open-source Java packages that can deal with mainframe Cobol data.