Search code examples
arrayscbit-manipulationadditionbigint

Adding two byte arrays which represent unsigned numbers together fails to add last carry in c


I have been writing a big numbers library for practice reasons in C and the addition operation is causing me some issues. It works normally but ignores the carry at the end and the result is incorrect.

I have been trying to write a big number library in c and here is the code so far for the addition operation. Code so far:

The BIG_INT type is a typedef for a unsigned char array of 8 bytes. 8 bytes is used for testing, it will be like 512 bytes later.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#define BIG_INT_SIZE 8
typedef unsigned char BIG_INT[BIG_INT_SIZE];

int add_big_int(BIG_INT *left, size_t right)
{
    if (!right)
        return 0;

    char buffer[sizeof(size_t)];
    memcpy(buffer, &right, sizeof(size_t));

    unsigned int carry = 0;
    for (int64_t i = BIG_INT_SIZE - 1; i >= 0; --i) {
        unsigned int sum = (*left[i] & 0xff) + (buffer[i] & 0xff) + carry;
        *left[i] = sum;
        carry = sum >> 8;
    }

    printf("carry: %u\n", carry);
    return 0;
}

int main() 
{
    BIG_INT big = { 0, 0, 0, 0, 0, 0, 0, 1 };
    add_big_int(&big, 255);
    return 0;
}

When adding numbers like 256 + 1 = 257, the byte array is [0, 0, 0, 0, 0, 0, 1, 1] (reverse order), however when adding 255 + 1 = 256, the byte array is [0, 0, 0, 0, 0, 0, 0, 0] instead of the expected [0, 0, 0, 0, 0, 0, 1, 0] and the carry variable is set to 1.


Solution

  • The code is disorganized, and some organization will help fix it.

    Look at this loop: for (int64_t i = BIG_INT_SIZE - 1; i >= 0; --i). That iterates through BIG_INT_SIZE elements, using left[i] and buffer[i]. How many elements are in buffer?

    buffer is declared char buffer[sizeof(size_t)];, so, if sizeof(size_t) is different from BIG_INT_SIZE, the loop is massively broken. You either need buffer and left to have the same number of elements or you need code in the loop to handle disparate sizes.

    Let’s make them the same size, using char buffer[BIG_INT_SIZE];.

    Next, compare these two declarations:

    typedef unsigned char BIG_INT[BIG_INT_SIZE];
    
       char buffer[BIG_INT_SIZE];
    

    What is different? One uses unsigned char, and the other uses char. The C standard does not say whether char is signed or unsigned. When you put 255 in signed char, it is likely to take on the value −1, although this is implementation-defined. You want unsigned char:

       unsigned char buffer[BIG_INT_SIZE];
    

    Next, you have memcpy(buffer, &right, sizeof(size_t));, but we need to account for the change in how the size of buffer is set. Also, memcpy copies the bytes in whatever order they are in memory, but you want the low bytes at higher addresses, judging from your other code. So what you can do is:

       for (int i = 0; i < sizeof right; ++i)
       {
           buffer[BIG_INT_SIZE - 1 - i] = right & 0xff;
           right >>= 8;
       }
       for (int i = sizeof right; i < BIG_INT_SIZE; ++i)
           buffer[BIG_INT_SIZE - 1 - i] = 0;
    

    The first loops puts the low-value byte of right into the high-address element of buffer and then shifts the bytes of right right. The second loop fills in any further elements of buffer with zeros.

    A potential hazard here is that right is larger than buffer, which would make the first loop overrun the buffer. We can guard against that with:

        _Static_assert(sizeof right <= sizeof buffer, "right is too big for buffer");
    

    I used & 0xff above, mimicking your existing code. Often, people would design for an eight-bit byte, in which case the & 0xff is unnecessary; the assignment could be just buffer[BIG_INT_SIZE - 1 - i] = right;, and the automatic conversion of right to unsigned char will effectively remove higher bits. If you make that change, you may want to insert a _Static_assert that CHAR_BIT is eight. CHAR_BIT is defined in <limits.h>.

    Next, look at unsigned int sum = (*left[i] & 0xff) + (buffer[i] & 0xff) + carry;. Array subscripting has higher precedence than pointer dereferencing, so *left[i] is *(left[i]), not (*left)[i]. left is a pointer to BIG_INT, and just one BIG_INT is passed, so you cannot use left[i] with i being non-zero. You need (*left) to get the BIG_INT and then [i] to select an element from it, so you want unsigned int sum = ((*left)[i] & 0xff) + (buffer[i] & 0xff) + carry;.

    Similarly, *left[i] = sum; should be (*left)[i] = sum;.

    With those changes made, your code will get the correct answer for the case you show. However, you should consider other design aspects, including:

    • Do you want to define BIG_INT as an array? It may be safer as a structure containing an array. This would have caught the *left[i] error earlier?
    • Why is the type of right size_t? size_t has a connotation that it is used with object sizes, not arbitrary arithmetic values that might be used with big-integer types. Do you want to use a signed type for it?
    • Why does the loop use int64_t for i? That is a fixed-width type. You might want a type like ptrdiff_t, which is suitable for indexing arrays (it is signed, so it can be used with an i >= 0 test, and it is a type designed to adjust its width to the target platform.)
    • What will you do when the final carry is non-zero?
    • Will add_big_int always return zero, or do you have some future plan for it?