Reconstruct USB CDC stream

I have a USB CDC interface on an STM32 to which I send messages that can be longer than a single 64-byte chunk. This means I get multiple callbacks of CDC_Receive_FS(uint8_t* Buf, uint32_t *Len), where I have to copy the incoming data into another buffer (e.g. a ring buffer).

My problem is: how do I know whether data belongs to a new message or is a continuation of a previous (larger) one? What I can tell is that if fewer than 64 bytes arrived, I can assume that the message is complete. But if exactly 64 bytes are in the buffer, I'd technically have to wait to see whether more data is coming. And even then, how long do I have to wait so that I don't mix it up with a new message?

Solution

USB CDC and all serial protocols derived from RS-232 implement stream-based communication, i.e. a potentially endless stream of bytes.

They do not implement message-based communication. Therefore, they have no concept of messages, and no concept of message start and message end.

The lower layers of USB are message-based. So you might observe patterns that look like message-based communication. You might also think that USB CDC is message-based because the STM32Cube framework exposes a USB API that delivers more data whenever a low-level USB message has arrived.

But this behavior can only be observed if small chunks of data are sent with pauses in-between and if the USB bus is mostly idle. It collapses if speed is increased or the USB bus becomes busier.

Then your PC will start to merge chunks of data. This can easily be tested by sending 10 bytes 100 times as fast as possible. The first 10 bytes are likely sent in their own packet. But the remaining data will be merged and sent as packets of 64 bytes (except for the last few bytes).

So if you want to have a message-oriented protocol on top of a stream-oriented protocol, the two typical approaches are:

Use a delimiter between messages. As an example: If you have a human-readable text protocol, a linefeed is often used as the delimiter. Once you encounter a linefeed character, you know you have a complete message (i.e. line).
Use a length indicator at the start of the message. This is useful for binary protocols. The first two bytes could contain the message length in bytes, encoded as a 16-bit number. So it will be obvious when the message is complete.
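
As a rough illustration of the delimiter approach, here is a minimal sketch in C. The names line_framer, line_framer_push and MAX_LINE are made up for this example and are not part of the STM32Cube API; the maximum line length and the policy of dropping oversized lines are assumptions as well.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LINE 256          /* assumed upper bound for one message */

/* Hypothetical framer state; zero-initialize before first use. */
typedef struct {
    uint8_t buf[MAX_LINE];    /* bytes of the line collected so far */
    size_t  len;              /* number of valid bytes in buf */
    bool    overflow;         /* current line exceeded MAX_LINE */
} line_framer;

/*
 * Push one received byte. Returns true when the byte was a linefeed that
 * completed a line; the line (without the linefeed) is then available in
 * f->buf[0 .. *out_len) until the next call. Oversized lines are dropped.
 */
bool line_framer_push(line_framer *f, uint8_t c, size_t *out_len)
{
    if (c == '\n') {
        bool ok = !f->overflow;
        *out_len = f->len;
        f->len = 0;           /* start collecting the next line */
        f->overflow = false;
        return ok;
    }
    if (f->len < MAX_LINE) {
        f->buf[f->len++] = c;
    } else {
        f->overflow = true;   /* keep discarding until the next linefeed */
    }
    return false;
}
```

Inside CDC_Receive_FS you would loop over the *Len bytes in Buf and call line_framer_push for each one, handling a message whenever it returns true. Because the state lives in the framer and not in the callback, it does not matter how the bytes were split into 64-byte packets.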

Also note that a received packet can contain several messages, or the end of one message and the start of the next.
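
To show how a parser can cope with such arbitrary packet boundaries, here is a sketch of a length-prefixed framer in C. It is only an illustration under assumptions: the names len_framer, len_framer_push, len_framer_feed and MAX_PAYLOAD are hypothetical, the 16-bit length is assumed to be little-endian and to count payload bytes only, and messages larger than MAX_PAYLOAD are skipped.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PAYLOAD 512        /* assumed largest payload we accept */

/* Hypothetical parser state; zero-initialize before first use. */
typedef struct {
    uint8_t  payload[MAX_PAYLOAD];
    uint16_t expected;         /* payload length taken from the header */
    uint16_t received;         /* payload bytes seen so far */
    uint8_t  header_bytes;     /* header bytes seen so far (0..2) */
} len_framer;

typedef void (*message_handler)(const uint8_t *msg, uint16_t len);

/* Push one byte; returns true when it completed a message. */
bool len_framer_push(len_framer *f, uint8_t c)
{
    if (f->header_bytes < 2) {
        /* Assumption: length is sent little-endian, low byte first. */
        if (f->header_bytes == 0)
            f->expected = c;
        else
            f->expected |= (uint16_t)((uint16_t)c << 8);
        f->header_bytes++;
        f->received = 0;
        if (f->header_bytes == 2 && f->expected == 0) {
            f->header_bytes = 0;          /* empty message is complete */
            return true;
        }
        return false;
    }
    if (f->received < MAX_PAYLOAD)
        f->payload[f->received] = c;      /* excess bytes are not stored */
    f->received++;
    if (f->received == f->expected) {
        f->header_bytes = 0;              /* next byte starts a new header */
        return f->expected <= MAX_PAYLOAD; /* oversized messages are dropped */
    }
    return false;
}

/* Feed a whole received chunk; returns the number of completed messages. */
size_t len_framer_feed(len_framer *f, const uint8_t *data, size_t len,
                       message_handler handler)
{
    size_t completed = 0;
    for (size_t i = 0; i < len; i++) {
        if (len_framer_push(f, data[i])) {
            completed++;
            if (handler)
                handler(f->payload, f->expected);
        }
    }
    return completed;
}
```

Calling len_framer_feed with Buf and *Len from each CDC_Receive_FS callback handles all three cases: a message split over several packets, several messages in one packet, and a packet that ends one message and starts the next.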