
c reinterpret pointer to datatype with bigger size


I'm trying to interpret WebSocket frames that I receive over a TCP connection. I want to do this in pure C (so no reinterpret_cast). The format is specified in IETF RFC 6455. I want to fill the following struct:

typedef struct {
    uint8_t flags;
    uint8_t opcode;
    uint8_t isMasked;
    uint64_t payloadLength;
    uint32_t maskingKey;
    char* payloadData;
} WSFrame;

with the following function:

static void parseWsFrame(char *data, WSFrame *frame) {
    frame->flags = (*data) & FLAGS_MASK;
    frame->opcode = (*data) & OPCODE_MASK;
    //next byte
    data += 1;
    frame->isMasked = (*data) & IS_MASKED;
    frame->payloadLength = (*data) & PAYLOAD_MASK;

    //next byte
    data += 1;

    if (frame->payloadLength == 126) {
        frame->payloadLength = *((uint16_t *)data);
        data += 2;
    } else if (frame->payloadLength == 127) {
        frame->payloadLength = *((uint64_t *)data);
        data += 8;
    }

    if (frame->isMasked) {
        frame->maskingKey = *((uint32_t *)data);
        data += 4;
    }else{
        //still need to initialize it to shut up the compiler
        frame->maskingKey = 0;
    }
    frame->payloadData = data;
}

The code is for the ESP8266, so debugging is only possible with printfs to the serial console. Using this method, I discovered that the code crashes at the line frame->maskingKey = *((uint32_t *)data);. The first two if branches are skipped, so this is the first place where I cast the pointer to another pointer type.

The data is not \0 terminated, but I get the size in the data received callback. In my test, I'm trying to send the message 'test' over the already established WebSocket, and the received data length is 10, so:

  • 1 byte flags and opcode
  • 1 byte masked and payload length
  • 4 bytes masking key
  • 4 bytes payload data

At the point where the code crashes, I expect data to be offset by 2 bytes from the initial position, so there is enough data left to read the following 4 bytes.

I have not written any C for a long time, so I expect only a small error in my code.

PS.: I've seen a lot of code that interprets the values byte by byte and shifts them into place, but I see no reason why this method should not work either.


Solution

  • The problem with casting a char* to a pointer to a larger type is that some architectures do not allow unaligned reads.

    That is, if you read a uint32_t through a pointer, the value of the pointer itself has to be a multiple of 4. Otherwise, on some architectures, you will get a fault of some sort (a bus error, signal, trap, or exception, depending on the platform). In your function, data has been advanced by 2 bytes when the masking key is read, so even if the receive buffer itself starts at an aligned address, the pointer is no longer a multiple of 4.

    Because this data comes in over TCP and the protocol lays out its fields without any padding, you will likely need to copy each field out of the buffer into a local variable or struct member (for example with memcpy, or byte by byte) rather than dereferencing a cast pointer. For example:

    if (frame->isMasked) {
        memcpy(&frame->maskingKey, data, 4);
        data += 4;
        // TODO: handle endianness: e.g.: frame->maskingKey = ntohl(frame->maskingKey);
    }else{
        //still need to initialize it to shut up the compiler
        frame->maskingKey = 0;
    }
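
The same idea can be applied to the whole header. What follows is only a minimal sketch, not a drop-in replacement: the name parseWsFrameSafe, the helper readBigEndian, the WS_* mask values (taken from the frame layout in RFC 6455, section 5.2), the extra len parameter, and the 0 / -1 return convention are all assumptions made for illustration. It also stores the masking key as four bytes instead of a uint32_t, which differs from the struct in the question but avoids the endianness question for that field. The extended payload lengths are sent in network byte order, so they are assembled with shifts, which works regardless of alignment and host endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bit masks for the first two header bytes (RFC 6455, section 5.2).
 * These names and values are assumptions; the question does not show its own. */
#define WS_FLAGS_MASK   0xF0u  /* FIN, RSV1, RSV2, RSV3 */
#define WS_OPCODE_MASK  0x0Fu
#define WS_IS_MASKED    0x80u
#define WS_PAYLOAD_MASK 0x7Fu

typedef struct {
    uint8_t flags;
    uint8_t opcode;
    uint8_t isMasked;
    uint64_t payloadLength;
    uint8_t maskingKey[4];      /* kept as raw bytes (assumption, see above) */
    const uint8_t *payloadData; /* points into the caller's buffer */
} WSFrame;

/* Assemble a big-endian (network order) integer from n bytes, n <= 8.
 * Only single bytes are read, so the pointer needs no particular alignment. */
static uint64_t readBigEndian(const uint8_t *p, size_t n) {
    uint64_t value = 0;
    for (size_t i = 0; i < n; i++) {
        value = (value << 8) | p[i];
    }
    return value;
}

/* Parses the frame header only. Returns 0 on success, or -1 if the buffer
 * is too short to contain the complete header. */
static int parseWsFrameSafe(const uint8_t *data, size_t len, WSFrame *frame) {
    assert(data != NULL && frame != NULL);
    size_t pos = 0;

    if (len < 2) return -1;
    frame->flags = data[0] & WS_FLAGS_MASK;
    frame->opcode = data[0] & WS_OPCODE_MASK;
    frame->isMasked = (data[1] & WS_IS_MASKED) != 0;
    frame->payloadLength = data[1] & WS_PAYLOAD_MASK;
    pos = 2;

    if (frame->payloadLength == 126) {
        /* 16-bit extended payload length */
        if (len - pos < 2) return -1;
        frame->payloadLength = readBigEndian(data + pos, 2);
        pos += 2;
    } else if (frame->payloadLength == 127) {
        /* 64-bit extended payload length */
        if (len - pos < 8) return -1;
        frame->payloadLength = readBigEndian(data + pos, 8);
        pos += 8;
    }

    if (frame->isMasked) {
        if (len - pos < 4) return -1;
        memcpy(frame->maskingKey, data + pos, 4); /* byte copy, alignment-safe */
        pos += 4;
    } else {
        memset(frame->maskingKey, 0, sizeof frame->maskingKey);
    }

    frame->payloadData = data + pos;
    return 0;
}
```

For the 10-byte test frame from the question, this should report a payload length of 4 and leave payloadData pointing 6 bytes into the buffer. The sketch does not check that the payload itself fits into the received data; comparing len - 6 (or the equivalent offset for longer headers) against payloadLength would be a sensible next step.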