Tags: c++, file, c++17, radix, logarithm

How can I print a human-readable file size in C++ without a loop?


I want to print a file size in C++. My input is in bytes, and I want to print it in KiB once it reaches 1024, in MiB once it reaches 1024*1024, and so on. Alternatively, it should use decimal units: KB for 1000 and above, and so on.

It should also have a fractional part so that I can distinguish between 1.5 GiB and 1.2 GiB.

What I know is that I can use the logarithm to compute which unit to choose. So if log_1024(x) >= 1 then it should be in KiB. This way I could avoid an unnecessary loop.

Also, I have a function for printing the fractions already:

std::string stringifyFraction(unsigned numerator,
                              unsigned denominator,
                              unsigned precision);

Solution

  • The logarithm base 1000 or 1024 can indeed be used to determine the right unit. We only need the integral part of the logarithm, that is, the part in front of the decimal point. The integer logarithm can be computed in O(1): for base 2, most modern CPUs provide a count-leading-zeros instruction, and other bases can be derived from that result with a small lookup table. This avoids the loop, although the speedup is small because a loop would run at most a handful of iterations for a 64-bit size. The linked post on computing the integer logarithm explains how to do this efficiently.

    If the integral part is 0, we print in B; if it is 1, in KiB; and so on. For example, 1536 bytes gives floor(log_1024(1536)) = 1, so it is printed as 1.5 KiB. We can create a lookup table indexed by that logarithm:

    constexpr char FILE_SIZE_UNITS[8][3] {
        "B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"
    };
    

    Note that the inner size is 3 because each string holds up to two characters plus the null terminator. You might also be wondering why the lookup table doesn't contain the KiB units. This is because the i in the middle is constant and doesn't need to be part of the table. Also, there are two different unit systems for file sizes, one base 1000 and one base 1024. See Files size units: “KiB” vs “KB” vs “kB”. We can easily support both in one function.

    We can then implement our stringifyFileSize function as follows:

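    #include <cstddef>     // std::size_t
    #include <cstdint>     // std::uint64_t
    #include <string>
    #include <type_traits> // std::enable_if_t

    // use SFINAE to only allow base 1000 or 1024
    template <std::size_t BASE = 1024, std::enable_if_t<BASE == 1000 || BASE == 1024, int> = 0>
    std::string stringifyFileSize(std::uint64_t size, unsigned precision = 0)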
    {
        static constexpr char FILE_SIZE_UNITS[8][3] {
            "B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"
        };
    
        // makePowerTable and logFloor are helpers from the linked post
        // about computing the integer logarithm; they are not defined here.
        // The table is equivalent to {1, 1000, 1000 * 1000, ...}
        // or {1, 1024, 1024 * 1024, ...}
        static constexpr auto powers = makePowerTable<std::uint64_t, BASE>();
    
        unsigned unit = logFloor<BASE>(size);
    
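        // The numerator is size, the denominator is 1000^unit or 1024^unit.
        // Note: stringifyFraction as declared in the question takes unsigned
        // arguments; it needs 64-bit parameters to avoid truncating large sizes.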
        std::string result = stringifyFraction(size, powers[unit], precision);
        result.reserve(result.size() + 5);
    
        // Optional: Space separating number from unit. (usually looks better)
        result.push_back(' ');
        char first = FILE_SIZE_UNITS[unit][0];
        // Optional: Use lower case k for the decimal kilo prefix (kB).
        // The other SI prefixes (M, G, T, ...) and the B for bytes stay upper case.
        if constexpr (BASE == 1000) {
            if (unit == 1) {
                first = 'k';
            }
        }
        result.push_back(first);
    
        // Don't insert anything more in case of single bytes.
        if (unit != 0) {
            if constexpr (BASE == 1024) {
                result.push_back('i');
            }
            result.push_back(FILE_SIZE_UNITS[unit][1]);
        }
    
        return result;
    }
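
The function above relies on two helpers, makePowerTable and logFloor, which come from the linked post and are not shown here. The following is a minimal sketch of what they might look like in C++17, not the linked post's actual code. Their signatures are inferred from how the answer calls them; countPowers and log2Floor are hypothetical names introduced for illustration. The sketch assumes GCC or Clang, because it uses the __builtin_clzll builtin (C++20 offers std::bit_width as a portable alternative).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: how many powers of BASE (starting at BASE^0) fit in Uint.
// Only evaluated at compile time, so this loop costs nothing at run time.
template <typename Uint, std::size_t BASE>
constexpr std::size_t countPowers()
{
    std::size_t count = 1;
    Uint limit = std::numeric_limits<Uint>::max();
    while (limit >= BASE) {
        limit /= BASE;
        ++count;
    }
    return count;
}

// Builds {1, BASE, BASE * BASE, ...} with every entry representable in Uint.
template <typename Uint, std::size_t BASE>
constexpr auto makePowerTable()
{
    std::array<Uint, countPowers<Uint, BASE>()> table{};
    Uint power = 1;
    for (std::size_t i = 0; i < table.size(); ++i) {
        table[i] = power;
        if (i + 1 < table.size()) {
            power *= BASE;
        }
    }
    return table;
}

// floor(log2(x)) via count-leading-zeros (GCC/Clang builtin, an assumption).
// The builtin is undefined for 0, so 0 is treated like 1.
inline unsigned log2Floor(std::uint64_t x)
{
    return x == 0 ? 0u : 63u - static_cast<unsigned>(__builtin_clzll(x));
}

// floor(log_BASE(x)) for BASE 1000 or 1024, without a run-time loop.
template <std::size_t BASE>
unsigned logFloor(std::uint64_t x)
{
    static_assert(BASE == 1000 || BASE == 1024, "only base 1000 or 1024");
    static constexpr auto powers = makePowerTable<std::uint64_t, BASE>();

    // 1024 = 2^10, so this is exact for base 1024.
    unsigned guess = log2Floor(x) / 10;

    if constexpr (BASE == 1000) {
        // 1000 < 1024, so the base-1000 result is either guess or guess + 1.
        if (guess + 1 < powers.size() && x >= powers[guess + 1]) {
            ++guess;
        }
    }
    return guess;
}
```

With these in scope, logFloor<1024>(1536) yields 1 and the power table entry at index 1 is 1024, so stringifyFileSize would hand the fraction 1536/1024 to stringifyFraction. Both tables have 7 entries for a 64-bit size, which means the "ZB" entry of the unit table is never reached.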