Convert uint8_t* to any type, say "double" or a struct, in C?


In C, I have a function declared like this:

void foo(A a);

Here, the type A is known. It can be any type: an int, a pointer, a struct, or any other user-defined type.

Now I have some data pointed to by a pointer, say uint8_t* data, of size n. How can I convert data to a value a of type A?

I want to use this to test foo with random data of type uint8_t* and size n, supplied by a fuzzing backend.


Solution

  • Convert uint8_t* to any type in C?

    This is not possible to do generically in C. C has no reflection, and without it nothing can be said about "any type". Without knowing the object representation of that type, and without knowing the serialization method used to encode the object in an array of uint8_t, it is not possible to automatically derive a conversion function.

    You can reinterpret the bytes pointed to by the uint8_t*. Doing that by casting the pointer to A* and dereferencing it violates the strict aliasing rule, the access may be misaligned, and the result is undefined behavior. Use memcpy instead (this is most probably what you actually want to do):

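    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    // A is assumed to be a struct with a member `double some_member`
    void foo(A a, size_t arrsize, const uint8_t arr[arrsize]) {
        assert(arrsize >= sizeof(A)); // rudimentary safety check (compiled out with NDEBUG)
        memcpy(&a, arr, sizeof(A));   // overwrite a with the first sizeof(A) bytes
        // use a
        printf("%f", a.some_member);
    }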
    

    Alternatively, use a union to do type punning. As with memcpy, arbitrary bytes may form a trap representation of the destination type, in which case reading the value is undefined behavior, although in practice you may well be fine.
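
    As a rough illustration, the memcpy approach could be wired into a fuzz harness as sketched below. The question does not name a fuzzing backend, so the libFuzzer entry point LLVMFuzzerTestOneInput is an assumption, and the struct A, the body of foo, and the two bookkeeping variables are hypothetical placeholders. Copying raw bytes like this is only reasonable when every bit pattern is an acceptable value of A (no pointers, no _Bool or enum members with invalid values).

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Hypothetical stand-in for the type under test. */
    typedef struct {
        int32_t x;
        double  y;
    } A;

    /* Bookkeeping so the effect of a call can be observed (illustration only). */
    static int    foo_calls;
    static double last_y;

    /* Placeholder for the real function under test. */
    void foo(A a) {
        foo_calls++;
        last_y = a.y;
    }

    /* libFuzzer-style entry point (assumed backend). */
    int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
        assert(data != NULL || size == 0);

        A a;
        if (size < sizeof a)
            return 0;               /* not enough bytes to fill an A: skip input */

        memcpy(&a, data, sizeof a); /* copy raw bytes into a properly aligned object */
        foo(a);
        return 0;
    }
    ```

    Inputs shorter than sizeof(A) are skipped rather than asserted on, so that a short input is not reported as a crash.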

    The only proper way to convert an array of bytes to the destination type is to write a deserialization/conversion function. The algorithm depends on the object representation of the type A and on the format and encoding of the source data (JSON? YAML? "raw" bytes in big endian or little endian? MSB first or LSB first? etc.).

    Note that uint8_t represents an unsigned integer that is exactly 8 bits wide, with a range of 0 to 255. To represent a "byte" in C, use the unsigned char type. unsigned char is guaranteed to have the weakest alignment requirement and a sizeof equal to 1, and any object may be accessed through a pointer to a character type.
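
A hand-written deserialization function of the kind described above could look like the following minimal sketch. Everything specific in it is an assumption made for illustration: struct sample stands in for A, and the wire format (a 4-byte little-endian id followed by the 8-byte little-endian IEEE 754 bit pattern of value) is invented. It also assumes double is 64 bits wide and uses the same byte order as integers on the host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the asker's type A. */
struct sample {
    uint32_t id;
    double   value;
};

/* Assumed wire format: 4-byte id + 8-byte double bit pattern, little endian. */
enum { SAMPLE_WIRE_SIZE = 12 };

_Static_assert(sizeof(double) == sizeof(uint64_t), "sketch assumes 64-bit double");

/* Assemble integers byte by byte so the result does not depend on host endianness. */
static uint32_t read_u32_le(const uint8_t *p) {
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

static uint64_t read_u64_le(const uint8_t *p) {
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Returns false if the input is too short, true after filling *out. */
bool sample_decode(const uint8_t *data, size_t n, struct sample *out) {
    assert(out != NULL);
    if (n < SAMPLE_WIRE_SIZE)
        return false;

    out->id = read_u32_le(data);

    uint64_t bits = read_u64_le(data + 4);
    memcpy(&out->value, &bits, sizeof out->value); /* reinterpret bits as double */
    return true;
}
```

Unlike a raw memcpy of the whole struct, this version does not depend on struct padding or on the host's integer byte order, and it gives one place to reject or normalize values that foo should never see.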