Search code examples
phpmathencodinglogarithm

Encoding very large numbers into Strings


I have been trying to encode very large integers into sequences of capital letters using PHP. assigning values from 0 to 25 to A~Z I have shortened 1 billion down to DGEHTYM. if a working algorithm is achieved, using different cases, numbers and even symbols the resulting string could be far shorter. my goal however is just strings of capital letters, perhaps with numeric characters too to make it shorter but not a must.

to encode using pen, paper and my phone's calculator:

  • find the highest power of 26 lower than 1 billion (php has the function log for that). for 1 billion it's 6. so 26^6 = 308,915,776.
  • 1,000,000,000 by 308,915,776 is 3.237 with a lot of decimals
  • 3 is D. we got our first letter!
  • now we will do the same thing over and over until we get an integer: take away the whole part and multiply the decimals left by 26. every whole we take out is a letter. for 1 billion I looped 6 times until I got 12, which is M.

maybe there is a smarter way to accomplish this but I haven't figured it out.

perhaps I should've done it differently but for the lack of a better way I tried to replicate my calculations in PHP, but I had problems: 1: it's very difficult to encode numbers that encoded would end with A (0) such as 1326 which would be BZA and 2: I couldn't manage to get PHP to detect whether a number has or doesn't have decimals. I had to keep converting it into a string and looking for a dot.

decoding the string is easier. from right to left, you multiply the value of each letter by 26 elevated to the power of its position, except for the first one, them sum it all:

D => 3 x (26^6) = 926,747,328
G => 6 x (26^5) = 71,288,256
E => 4 x (26^4) = 1,827,904
H => 3 x (26^3) = 123,032
T => 19 x (26^2) = 12,844
Y => 24 x 26 = 624
M => 12
926,747,328 + 71,288,256 + 1,827,904 + 123,032 + 12,844 + 624 + 12 = 1,000,000,000

there is a similar question here but those who answered didn't seem to understand what the OP was trying to accomplish.


Solution

  • You're converting from base 10 to 26, this is in the range that built-in functions like base_convert or gmp_strval can handle. They'll give digits in the form of 0-9 and a-p, so you just need to translate those into what you're after with strtr.

    $translate = array_combine(
        array_merge(range(0, 9), range('a', 'p')),
        range('A', 'Z')
    );
    
    echo strtr(base_convert(1000000000, 10, 26), $translate), "\n";
    echo strtr(base_convert(1326, 10, 26), $translate), "\n";
    
    //Or if you're working with really big numbers (beyond PHP_INT_MAX):
    echo strtr(gmp_strval(1000000000, 26), $translate), "\n";
    echo strtr(gmp_strval(1326, 26), $translate);
    

    Output:

    DGEHTYM
    BZA
    DGEHTYM
    BZA
    

    Or to do it manually, which'll make it easy to add/remove symbols for different bases and encodings. Basically you divide your input number by the base, use the remainder to look up the appropriate symbol for that position, then repeat on the result of the division.

    function base10_to_whatevs($num, array $digits)
    {
        $result = '';
        $base = count($digits);
        while ($num > 0) {
            $remainder = $num % $base;
            $num = ($num - $remainder) / $base;
            $result = $digits[$remainder] . $result;
        }
        return $result;
    }
    
    $letters = range('A', 'Z');
    echo base10_to_whatevs(1000000000, $letters), "\n";
    echo base10_to_whatevs(1326, $letters);
    
    DGEHTYM
    BZA