I am tasked with migrating a Perl program to Java, and I am having some trouble. There is a function in Perl that calculates a checksum:
sub ComputeChecksum {
    my ($val) = @_;
    $value= 0;
    $multiplier = 1;
    $dlength=length($val);
    for($i=0; $i<$dlength; $i++) {
        $ival=ord(substr($val, $i, 1));
        $ival*=$multiplier;
        $value+=$ival/100000;
        $value+=$ival%100000;
        $multiplier%=2;
        $multiplier++;
    }
    $value%=100000;
    $value=(100000-$value)%100000;
    return ($value);
}
I translated this code into Java as shown below:
private static long computeCheckSum(String id){
    long value = 0;
    long multiplier = 1;
    int dlength = id.length();
    for(int i=0; i< dlength; i++){
        int ival = id.charAt(i);
        ival *= multiplier;
        value += ival/100000;
        value += ival%100000;
        multiplier %= 2;
        multiplier++;
    }
    value %= 100000;
    value = ((100000-value)%100000);
    return value;
}
But I am running into an issue when validating old data that is in the database: around 5% of the time, the checksum calculated in Java does not match the one previously calculated by Perl, and it is off by 1.
Here are some examples of the old data and the results in Java:
uid:Perl / Java
ff08ccfba8ad417db2857fc8933788af:96410 / 96409
ff163b2b2e3ef18265d08cc8965b864d:96533 / 96532
ff3848ff301b534b609148af93b2c9ce:96626 / 96625
ff48ec78ea190f44233c9050fd73137d:96631 / 96630
ff62234601e28e6f5d7ff89d95afbec0:96424 / 96423
ff78f4cabe5a565a1cdf11f752f13654:96495 / 96494
ff89dc6b86a535ef265727af90a9b6b3:96596 / 96595
ff98337de5eb9e60f1db51dbd14a4dd2:96366 / 96365
fff76022c6f9f7794793141a9f2a00d2:96813 / 96812
In each line, the first part is the data as it is saved in the database: a UID, a colon, and the checksum calculated by Perl.
For comparison, I added the slash and the result calculated by my Java code from the UID part.
Can anyone point me in the right direction as to why this could be happening? I need to be able to calculate the correct checksum 100% of the time.
======================================================================
Update:
I changed my Java code to do all calculations in float instead of int, and the result was more interesting: the mismatches occurred in the same group of UIDs, but the difference was 2 instead of 1.
I am thinking of proposing a recalculation of the checksum for all the UIDs, and updating all the tables that are affected.
Printing $ival/100000 for ff08ccfba8ad417db2857fc8933788af gives the following:
0.00102
0.00204
0.00048
0.00112
...
In the Java program, ival/100000 divides an int by an int, so it is an integer division and adds zero for these inputs, whereas Perl adds the small fractions shown above. This is a difference.
By the way, I suspect the Perl program is the buggy one. I suspect integer division was desired (int($ival/100000)), but floating-point division was accidentally used.
That said, for the numbers you provided, the difference doesn't seem significant enough to affect the result, i.e. using int($ival/100000) produces the same result as $ival/100000. There is apparently another difference.
...or is there? Adding
while (<DATA>) {
    chomp;
    say $_, ":", ComputeChecksum($_);
}
__DATA__
ff08ccfba8ad417db2857fc8933788af
ff163b2b2e3ef18265d08cc8965b864d
ff3848ff301b534b609148af93b2c9ce
ff48ec78ea190f44233c9050fd73137d
ff62234601e28e6f5d7ff89d95afbec0
ff78f4cabe5a565a1cdf11f752f13654
ff89dc6b86a535ef265727af90a9b6b3
ff98337de5eb9e60f1db51dbd14a4dd2
fff76022c6f9f7794793141a9f2a00d2
I get what you say are the Java results (even with $ival/100000):
ff08ccfba8ad417db2857fc8933788af:96409
ff163b2b2e3ef18265d08cc8965b864d:96532
ff3848ff301b534b609148af93b2c9ce:96625
ff48ec78ea190f44233c9050fd73137d:96630
ff62234601e28e6f5d7ff89d95afbec0:96423
ff78f4cabe5a565a1cdf11f752f13654:96494
ff89dc6b86a535ef265727af90a9b6b3:96595
ff98337de5eb9e60f1db51dbd14a4dd2:96365
fff76022c6f9f7794793141a9f2a00d2:96812
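
For reference, here is a minimal Java sketch of the two division behaviours discussed above. It is only an illustration, not the original poster's final code: the class name ChecksumSketch and both method names are made up. checksumPerlStyle accumulates in a double, as the Perl function does, and relies on the assumption that Perl's % truncates its operands to integers before taking the remainder, which is why the accumulated fractions never reach the result.

```java
class ChecksumSketch {

    // Integer-only port, as in the question: ival / 100000 is integer division.
    static long checksumIntegerDivision(String id) {
        long value = 0;
        long multiplier = 1;
        for (int i = 0; i < id.length(); i++) {
            long ival = id.charAt(i) * multiplier;
            value += ival / 100000;          // always 0 for character codes this small
            value += ival % 100000;
            multiplier = multiplier % 2 + 1; // alternates 1, 2, 1, 2, ...
        }
        value %= 100000;
        return (100000 - value) % 100000;
    }

    // Port that mimics the Perl arithmetic: floating-point division while
    // accumulating, then truncation to an integer where Perl applies %.
    static long checksumPerlStyle(String id) {
        double value = 0;
        long multiplier = 1;
        for (int i = 0; i < id.length(); i++) {
            long ival = id.charAt(i) * multiplier;
            value += ival / 100000.0;        // adds a small fraction, e.g. 0.00102
            value += ival % 100000;
            multiplier = multiplier % 2 + 1;
        }
        // Assumption: Perl's % drops the fractional part of its operands.
        long truncated = (long) value % 100000;
        return (100000 - truncated) % 100000;
    }

    public static void main(String[] args) {
        String uid = "ff08ccfba8ad417db2857fc8933788af";
        System.out.println(uid + ":" + checksumIntegerDivision(uid)); // prints ...:96409
        System.out.println(uid + ":" + checksumPerlStyle(uid));       // prints ...:96409
    }
}
```

Both methods return 96409 for ff08ccfba8ad417db2857fc8933788af, which matches the output above, so under these assumptions the floating-point division alone does not explain the stored value of 96410.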