Search code examples
algorithmnumeral-system

What is the logic behind converting int to Excel [A-Z]+ column identifier?


I read this answer that gives an elegant solution to the question on how to convert an integer into Excel column [A-Z]+ representation.

However I failed to figure out how it works. To my understanding, the core of such question is a "base-26 numeral system w/o 0", but I couldn't find a way that connects it to the solution.

Could anyone be so kind to give me some intuitions on how to properly think of such questions, i.e., what is the pattern of such question? Hopefully with correct understanding of the pattern I can understand the solution too.

Thanks in advance.


Solution

  • Let's start with a very large number, such as 1179247502191623642585138.


    • In each iteration, we reduce the number using division and modulus operations.

    • The modulus operation (columnNumber - 1) % 26 gives a number between 0-25, which corresponds to the letters from 'A'-'Z'. This step essentially maps the number to one character, exactly from A to Z.

    • If we wanted a larger base (e.g., Base 52), we could use both uppercase and lowercase letters for the mapping.

    • The remainder from the mod % operation determines the character to append to the result string. For example, if columnNumber % 26 is 0, it maps to the letter 'A', 1 maps to 'B', ..., 25 maps to 'Z'.

    • After determining the character for the current step, we reduce the number by dividing it by 26. This basically shifts it to the next significant digit.

    • The formula used is: (columnNumber - 1) / 26. Subtracting 1 ensures that the column number starts at 1 for the letter 'A', as Excel columns are 1-indexed.

    • The result string is built by prepending each computed character to the current string. This is important because the characters represent digits in a right-to-left base-26 system.

    • For example, in the first iteration, 'A' might be computed, but in subsequent iterations, 'B', 'C', etc., are added to the front of the result string.

    • This process goes on and on until the number becomes 0. At this point, the full column name has been constructed; and we exit the while loop.

    • The Excel column name corresponding to 1179247502191623642585138 is therefore AAABBBCCCDDDEEEFFF.


    In simple words, Base-26 conversion is a system where each digit is simply mapped into a letter (A-Z). The column number logarithmically decreases with each iteration until the resulting string is built from right to the left.

    Example

    using System;
    using System.Numerics;
    
    BigInteger columnNumber = BigInteger.Parse("1179247502191623642585138");
    
    string res = GetExcelColumnName(columnNumber);
    Console.WriteLine($"\n--------\nColumn {columnNumber} -> {res}"); 
    
    string GetExcelColumnName(BigInteger columnNumber)  // Change to BigInteger for large numbers
    {
        string columnName = "";
        Console.WriteLine($"Begin: columnNumber = {columnNumber}\n--------\n");
    
        while (columnNumber > 0)
        {
            // Use BigInteger for modulo for consistency
            BigInteger modulo = (columnNumber - 1) % 26;  
            // Convert modulo to int for char conversion
            columnName = Convert.ToChar('A' + (int)modulo) + columnName;  
            // Use BigInteger for division
            columnNumber = (columnNumber - 1) / 26;  
    
            Console.WriteLine($"columnNumber: {columnNumber}, modulo: {modulo}, columnName: {columnName}");
        }
    
        return columnName;
    }
    
    

    Printing:

    Begin: columnNumber = 1179247502191623642585138
    --------
    
    columnNumber: 45355673161216293945582, modulo: 5, columnName: F
    columnNumber: 1744448967739088228676, modulo: 5, columnName: FF
    columnNumber: 67094191066888008795, modulo: 5, columnName: FFF
    columnNumber: 2580545810264923415, modulo: 4, columnName: EFFF
    columnNumber: 99251761933266285, modulo: 4, columnName: EEFFF
    columnNumber: 3817375458971780, modulo: 4, columnName: EEEFFF
    columnNumber: 146822133037376, modulo: 3, columnName: DEEEFFF
    columnNumber: 5647005116822, modulo: 3, columnName: DDEEEFFF
    columnNumber: 217192504493, modulo: 3, columnName: DDDEEEFFF
    columnNumber: 8353557865, modulo: 2, columnName: CDDDEEEFFF
    columnNumber: 321290687, modulo: 2, columnName: CCDDDEEEFFF
    columnNumber: 12357334, modulo: 2, columnName: CCCDDDEEEFFF
    columnNumber: 475282, modulo: 1, columnName: BCCCDDDEEEFFF
    columnNumber: 18280, modulo: 1, columnName: BBCCCDDDEEEFFF
    columnNumber: 703, modulo: 1, columnName: BBBCCCDDDEEEFFF
    columnNumber: 27, modulo: 0, columnName: ABBBCCCDDDEEEFFF
    columnNumber: 1, modulo: 0, columnName: AABBBCCCDDDEEEFFF
    columnNumber: 0, modulo: 0, columnName: AAABBBCCCDDDEEEFFF
    
    --------
    Column 1179247502191623642585138 -> AAABBBCCCDDDEEEFFF
    

    We can also modify it for 52 Base:

    using System;
    using System.Numerics;
    
    BigInteger columnNumber = BigInteger.Parse("553949810732274270830732352828569361");
    
    string res = GetExcelColumnName(columnNumber);
    Console.WriteLine($"\n--------\nColumn {columnNumber} -> {res}");
    
    string GetExcelColumnName(BigInteger columnNumber)  // Change to BigInteger for large numbers
    {
        string columnName = "";
        Console.WriteLine($"Begin: columnNumber = {columnNumber}\n--------\n");
    
        while (columnNumber > 0)
        {
            // Use BigInteger for modulo for consistency
            BigInteger modulo = (columnNumber - 1) % 52;  
    
            // Convert modulo to char, A-Z (0-25) or a-z (26-51)
            if (modulo < 26)
            {
                columnName = Convert.ToChar('A' + (int)modulo) + columnName;
            }
            else
            {
                columnName = Convert.ToChar('a' + (int)(modulo - 26)) + columnName;
            }
    
            // Use BigInteger for division
            columnNumber = (columnNumber - 1) / 52;  
    
            Console.WriteLine($"columnNumber: {columnNumber}, modulo: {modulo}, columnName: {columnName}");
        }
    
        return columnName;
    }
    
    
    
    

    Printing:

    Begin: columnNumber = 553949810732274270830732352828569361
    --------
    
    columnNumber: 10652880975620659054437160631318641, modulo: 28, columnName: c
    columnNumber: 204863095685012674123791550602281, modulo: 28, columnName: cc
    columnNumber: 3939674917019474502380606742351, modulo: 28, columnName: ccc
    columnNumber: 75762979173451432738088591199, modulo: 2, columnName: Cccc
    columnNumber: 1456980368720219860347857523, modulo: 2, columnName: CCccc
    columnNumber: 28018853244619612698997260, modulo: 2, columnName: CCCccc
    columnNumber: 538824100858069474980716, modulo: 27, columnName: bCCCccc
    columnNumber: 10362001939578259134244, modulo: 27, columnName: bbCCCccc
    columnNumber: 199269268068812675658, modulo: 27, columnName: bbbCCCccc
    columnNumber: 3832101309015628378, modulo: 1, columnName: BbbbCCCccc
    columnNumber: 73694255942608238, modulo: 1, columnName: BBbbbCCCccc
    columnNumber: 1417197229665543, modulo: 1, columnName: BBBbbbCCCccc
    columnNumber: 27253792878183, modulo: 26, columnName: aBBBbbbCCCccc
    columnNumber: 524111401503, modulo: 26, columnName: aaBBBbbbCCCccc
    columnNumber: 10079065413, modulo: 26, columnName: aaaBBBbbbCCCccc
    columnNumber: 193828181, modulo: 0, columnName: AaaaBBBbbbCCCccc
    columnNumber: 3727465, modulo: 0, columnName: AAaaaBBBbbbCCCccc
    columnNumber: 71682, modulo: 0, columnName: AAAaaaBBBbbbCCCccc
    columnNumber: 1378, modulo: 25, columnName: ZAAAaaaBBBbbbCCCccc
    columnNumber: 26, modulo: 25, columnName: ZZAAAaaaBBBbbbCCCccc
    columnNumber: 0, modulo: 25, columnName: ZZZAAAaaaBBBbbbCCCccc
    
    --------
    Column 553949810732274270830732352828569361 -> ZZZAAAaaaBBBbbbCCCccc
    
    Time & Space Complexities:
    • O(N): N is the length of column string.