
Perl Checksum Calculation Migration to Java


I have been tasked with migrating a Perl program to Java.

I am having some trouble with a Perl function that calculates a checksum:

sub ComputeChecksum  {
   my ($val) = @_;

   $value= 0;
   $multiplier = 1;

   $dlength=length($val);
   for($i=0; $i<$dlength; $i++) {
      $ival=ord(substr($val, $i, 1));
      $ival*=$multiplier;
      $value+=$ival/100000;
      $value+=$ival%100000;
      $multiplier%=2;
      $multiplier++;
   }
   $value%=100000;
   $value=(100000-$value)%100000;
   return ($value);
}

I translated this code into Java as shown below:

private static long computeCheckSum(String id){
    long value = 0;
    long multiplier = 1;

    int dlength = id.length();

    for(int i=0; i< dlength; i++){
        int ival = id.charAt(i);
        ival *= multiplier;
        value += ival/100000;
        value += ival%100000;
        multiplier %= 2;
        multiplier++;
    }
    value %= 100000;
    value = ((100000-value)%100000);

    return value;
}

But I am running into an issue when validating old data in the database: around 5% of the time, the checksum calculated in Java does not match the one previously calculated by Perl, and it is off by 1.

Here are some examples of the old data and the results in Java:

                                 Perl    Java
ff08ccfba8ad417db2857fc8933788af:96410 / 96409
ff163b2b2e3ef18265d08cc8965b864d:96533 / 96532
ff3848ff301b534b609148af93b2c9ce:96626 / 96625
ff48ec78ea190f44233c9050fd73137d:96631 / 96630
ff62234601e28e6f5d7ff89d95afbec0:96424 / 96423
ff78f4cabe5a565a1cdf11f752f13654:96495 / 96494
ff89dc6b86a535ef265727af90a9b6b3:96596 / 96595
ff98337de5eb9e60f1db51dbd14a4dd2:96366 / 96365
fff76022c6f9f7794793141a9f2a00d2:96813 / 96812

In these sets, the first part is the data as it is saved in the database: a uid, a colon, and the checksum calculated by Perl.

For comparison, I added the slash and the result calculated by my Java code from the uid part.

Can anyone point me in the right direction as to why this could be happening? I need to be able to calculate the correct checksum 100% of the time.

======================================================================

Update :

I changed my Java code to do all calculations in float instead of int, and the result was more interesting: the differences appeared in the same group of uids, but instead of being off by 1, the results were off by 2.

I am thinking of proposing a recalculation of the checksum for all the UIDs, and updating all the tables that are affected.
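
A possible explanation for the off-by-two result, sketched under the assumption that the floating-point variant kept fractional values until a final cast to long (the class and method names below are hypothetical): Java's % on floating-point operands preserves the fraction, whereas Perl's % truncates its operands to integers first. For the first uid the running total is about 3591.03591, so subtracting it from 100000 leaves about 96408.96, which the cast truncates.

```java
// Hypothetical demo class; not part of the original program.
class FloatingChecksum {

    // Same algorithm as computeCheckSum, but every intermediate is a double.
    // Assumption: the result is cast back to long only at the very end.
    static long floatingChecksum(String id) {
        double value = 0;
        double multiplier = 1;

        for (int i = 0; i < id.length(); i++) {
            double ival = id.charAt(i) * multiplier;
            value += ival / 100000;   // adds a small fraction, e.g. 0.00102
            value += ival % 100000;   // adds ival itself for these inputs
            multiplier = multiplier % 2 + 1;  // alternates 1, 2, 1, 2, ...
        }

        value %= 100000;                    // the fraction survives in Java
        value = (100000 - value) % 100000;  // roughly 96408.96 for the first uid
        return (long) value;                // the cast drops the fraction
    }

    public static void main(String[] args) {
        // Stored Perl value for this uid is 96410; the integer Java version gives 96409.
        System.out.println(floatingChecksum("ff08ccfba8ad417db2857fc8933788af")); // 96408
    }
}
```

If this is what happened, the float version is one below the integer Java result and therefore two below the stored value, which matches the behaviour described above.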


Solution

  • Printing $ival/100000 for ff08ccfba8ad417db2857fc8933788af gives the following:

    0.00102
    0.00204
    0.00048
    0.00112
    ...
    

    In the Java program, ival is an int, so ival/100000 is an integer division and adds zero for every character here. In Perl, / is a floating-point division, so a small fraction is added each time. This is a difference.

    By the way, I suspect the Perl program is the buggy one. I suspect integer division was desired (int($ival/100000)), but floating point division was accidentally used.

    That said, for the numbers you provided, the difference is not large enough to change the result; that is, using int($ival/100000) produces the same result as $ival/100000. There is apparently another difference.


    ...or is there? Adding

    use feature 'say';   # needed for say

    while (<DATA>) {
       chomp;
       say $_, ":", ComputeChecksum($_);
    }
    
    __DATA__
    ff08ccfba8ad417db2857fc8933788af
    ff163b2b2e3ef18265d08cc8965b864d
    ff3848ff301b534b609148af93b2c9ce
    ff48ec78ea190f44233c9050fd73137d
    ff62234601e28e6f5d7ff89d95afbec0
    ff78f4cabe5a565a1cdf11f752f13654
    ff89dc6b86a535ef265727af90a9b6b3
    ff98337de5eb9e60f1db51dbd14a4dd2
    fff76022c6f9f7794793141a9f2a00d2
    

    I get what you say are the Java results (even with $ival/100000):

    ff08ccfba8ad417db2857fc8933788af:96409
    ff163b2b2e3ef18265d08cc8965b864d:96532
    ff3848ff301b534b609148af93b2c9ce:96625
    ff48ec78ea190f44233c9050fd73137d:96630
    ff62234601e28e6f5d7ff89d95afbec0:96423
    ff78f4cabe5a565a1cdf11f752f13654:96494
    ff89dc6b86a535ef265727af90a9b6b3:96595
    ff98337de5eb9e60f1db51dbd14a4dd2:96365
    fff76022c6f9f7794793141a9f2a00d2:96812
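
The point that the fractional terms do not change the outcome can be sketched in Java as well. This is a minimal illustration, not the original program: the class and method names are hypothetical, and perlStyleChecksum assumes that Perl's % truncates non-negative operands to integers before taking the remainder.

```java
// Hypothetical demo class; the names are illustrative only.
class ChecksumComparison {

    // Integer arithmetic throughout, as in the Java translation from the question.
    static long integerChecksum(String id) {
        long value = 0;
        long multiplier = 1;
        for (int i = 0; i < id.length(); i++) {
            long ival = id.charAt(i) * multiplier;
            value += ival / 100000;   // integer division: 0 whenever ival < 100000
            value += ival % 100000;
            multiplier = multiplier % 2 + 1;  // alternates 1, 2, 1, 2, ...
        }
        value %= 100000;
        return (100000 - value) % 100000;
    }

    // Emulates the Perl version: floating-point division, followed by a
    // remainder that first truncates the running total to an integer.
    static long perlStyleChecksum(String id) {
        double value = 0;
        long multiplier = 1;
        for (int i = 0; i < id.length(); i++) {
            long ival = id.charAt(i) * multiplier;
            value += ival / 100000.0; // fractional contribution, e.g. 0.00102
            value += ival % 100000;
            multiplier = multiplier % 2 + 1;
        }
        long truncated = (long) value % 100000; // the fraction is dropped here
        return (100000 - truncated) % 100000;
    }

    public static void main(String[] args) {
        String uid = "ff08ccfba8ad417db2857fc8933788af";
        System.out.println(integerChecksum(uid));   // 96409
        System.out.println(perlStyleChecksum(uid)); // 96409
    }
}
```

For a 32-character hex uid the fractions sum to well under 1 (about 0.036 for the uid above), so truncation removes them and both variants agree.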