Why does converting a hex string to a base36 string give a different result in C++ than in PHP?


Here is the C++ code:

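#include <stdio.h>
#include <stdlib.h>
#include <openssl/sha.h>
#include <string>
#include <gmp.h>

// Convert a hex string to a base36 string using GMP.
std::string base36enc(const std::string &data){
    mpz_t nr;
    mpz_init(nr);
    mpz_set_str(nr, data.c_str(), 16);
    char *buf = mpz_get_str(NULL, 36, nr); // buffer is allocated by GMP
    std::string result(buf);
    free(buf);      // assumes GMP's default (malloc-based) allocator
    mpz_clear(nr);  // release the big integer
    return result;
}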

//sha512 function here...

int main(){
    std::string data = "deadbeef1234d0";
    printf("Raw data to base36:\t%s\n",base36enc(data).c_str());
    printf("SHA512 data to b36:\t%s\n",base36enc(sha512(data)).c_str());
    printf("SHA512 data only  :\t%s\n",sha512(data).c_str());

    return 0;
}

And the output is:

Raw data to base36:     h55o0dxfmj4
SHA512 data to b36:     clldzg9hyfl5ihp0taww8rny0jxvz67rsk1w4og26zqyt4hdrya68yme09iwtew0tdq6aro9rk3jy2m3r2zpegumccc8ssrbnfr
SHA512 data only  :     4f3aff747e9ce090e5dfc5f23ce2a37233a21cfa2db7db70c984bc9ff8b263c9e02a6a485455c8042d10112f659a965e0bbf9645ee0c0e0c0824970dd879f667

Here is the PHP code:

<?php
$data = "deadbeef1234d0";
echo "Raw data to base36: <b>".base_convert($data, 16, 36)."</b><br>";
echo "SHA512 data to b36: <b>".base_convert(hash('sha512',$data), 16, 36)."</b><br>";
echo "SHA512 data only  : <b>".hash('sha512',$data)."</b>";

And the PHP output is:

Raw data to base36: h55o0dxfmj4
SHA512 data to b36: g8804wccs0kc8w8ckkogoc8ssgcs8ccc0sgssgs4g0gok8k8kkgss0og44swkwsc
SHA512 data only  : 4f3aff747e9ce090e5dfc5f23ce2a37233a21cfa2db7db70c984bc9ff8b263c9e02a6a485455c8042d10112f659a965e0bbf9645ee0c0e0c0824970dd879f667

The hashed string converted to base36 differs between PHP and C++. The hashing itself is not the problem (the hash results always match). If I change the last character of the data to 1 or any other hex digit higher than 0, the raw data output differs as well, and I can't understand why!

For example, if the data is "deadbeef1234df", the C++ raw-data-to-base36 output is "h55o0dxfmjj" and the PHP output is "h55o0dxfmjk".

Could someone help me find the reason for this "magic"?


Solution

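  • In PHP, the base_convert function converts a "number" between arbitrary bases and may lose precision on large numbers because of the internal "double" or "float" type it uses. A double has a 53-bit significand, while "deadbeef1234df" needs 56 bits, so its lowest bits get rounded. "deadbeef1234d0" happens to survive because its low four bits are zero. A 512-bit SHA-512 hash is far beyond that limit, which is why the two long results have almost nothing in common.

    Try this function instead of base_convert (it requires the BCMath extension):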

    Code

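    // Convert an arbitrarily large non-negative integer string between bases (2 to 36).
    function bignumber_base_convert($str, $frombase = 10, $tobase = 36)
    {
        $str = trim($str);
        if (intval($frombase) != 10) {
            // Build the decimal value one digit at a time: q = q * base + digit.
            $len = strlen($str);
            $q = '0';
            for ($i = 0; $i < $len; $i++) {
                $r = base_convert($str[$i], $frombase, 10);
                $q = bcadd(bcmul($q, (string) $frombase), $r);
            }
        } else {
            $q = $str;
        }
    
        if (intval($tobase) != 10) {
            // Peel off output digits, least significant first, by repeated division.
            $s = '';
            while (bccomp($q, '0', 0) > 0) {
                $r = intval(bcmod($q, (string) $tobase));
                $s = base_convert((string) $r, 10, $tobase) . $s;
                $q = bcdiv($q, (string) $tobase, 0);
            }
            if ($s === '') {
                $s = '0'; // the input was zero
            }
        } else {
            $s = $q;
        }
    
        return $s;
    }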
    

    Usage

    echo bignumber_base_convert(hash('sha512',$data), 16, 36) . "\n";
    

    Result

    clldzg9hyfl5ihp0taww8rny0jxvz67rsk1w4og26zqyt4hdrya68yme09iwtew0tdq6aro9rk3jy2m3r2zpegumccc8ssrbnfr