I was analyzing how size_t offers portability across platforms, using the reference http://www.embedded.com/electronics-blogs/programming-pointers/4026076/Why-size-t-matters. The following are a few points that I derived from it:

- size_t is a typedef of an unsigned integral type.
- size_t is guaranteed to be large enough to represent the size of the largest data object, but no larger.

I am unable to understand what they mean by "the largest data object". Also consider the following example that I took from that reference.
For an I16LP32 model they claim the size of the largest data object can be 2^32-1 bytes. Hence, in that case, if we use an unsigned int as a substitute for size_t in the memcpy function, we limit the processor's capability to objects of at most 2^16-1 bytes.
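To make the point concrete, here is a minimal sketch of the truncation I think they are describing (bad_memcpy is my own hypothetical stand-in for a memcpy-like routine declared with unsigned int):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical copy routine declared with unsigned int instead of size_t.
 * On an I16LP32 model (16-bit int, 32-bit long and pointer) unsigned int
 * holds only 0..2^16-1, so the length is silently reduced modulo 2^16
 * for any object bigger than 65535 bytes. */
static void *bad_memcpy(void *dst, const void *src, unsigned int n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

int main(void)
{
    static char src[100000];   /* a legal object on I16LP32: < 2^32-1 bytes */
    static char dst[100000];

    /* sizeof src is a size_t holding 100000; converted to a 16-bit
     * unsigned int that becomes 100000 % 65536 = 34464, so bad_memcpy
     * would copy only part of the object. The real memcpy takes a
     * size_t and copies all of it. */
    bad_memcpy(dst, src, sizeof src);
    memcpy(dst, src, sizeof src);
    return 0;
}
```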
So what determines the size of the largest data object?
The maximum possible size of an object is determined by the memory map of the architecture that the C implementation targets (in principle other implementation details can matter too, but in practice the memory map is what does it).
On a so-called "flat" memory model, which includes most systems in use today, objects can be almost the size of the whole address space (resources allowing), and so you'd expect size_t to be the size of a pointer. If your address space is 32 bits, then since each byte of the object has a different address, you clearly cannot have an object larger than 2^32-1 bytes (the -1 because the null pointer value "uses up" one address, which is guaranteed not to be the address of any object).
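As a quick sanity check (not a guarantee, since the standard doesn't require the two widths to match), you can print the width of size_t next to the width of a pointer; on a typical flat-memory platform they come out the same:

```c
#include <stdio.h>
#include <stdint.h>   /* SIZE_MAX */
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* size_t */

int main(void)
{
    /* On most flat-memory systems size_t is exactly as wide as a
     * pointer, though the standard only requires "large enough". */
    printf("size_t : %zu bits (max %zu)\n",
           sizeof(size_t) * CHAR_BIT, (size_t)SIZE_MAX);
    printf("pointer: %zu bits\n", sizeof(void *) * CHAR_BIT);
    return 0;
}
```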
On a segmented memory architecture, it could be that your C implementation can address 2^32 bytes of space, but that a single object is not permitted to span multiple 16-bit segments, and so in principle size_t could be a 16-bit type.
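If the implementation provides the optional uintptr_t type, you can observe this decoupling directly: SIZE_MAX bounds the size of a single object, while UINTPTR_MAX bounds the integer value of a pointer, and nothing says they must agree. A minimal probe:

```c
#include <stdio.h>
#include <stdint.h>   /* SIZE_MAX, UINTPTR_MAX (uintptr_t is optional) */

int main(void)
{
    /* On a segmented implementation SIZE_MAX could legitimately be far
     * smaller than UINTPTR_MAX; on flat systems they usually match. */
    printf("SIZE_MAX    = %ju\n", (uintmax_t)SIZE_MAX);
    printf("UINTPTR_MAX = %ju\n", (uintmax_t)UINTPTR_MAX);
    return 0;
}
```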
For that matter, a C implementer is permitted to place an arbitrary limit on malloc, out of sheer mischief, such that it will never return a block bigger than 2^24-1 bytes. With the same restriction in the compiler to prevent static or automatic objects of that size, size_t could then be a 24-bit type even if pointers are bigger.
You'd hope that no implementer would do this solely for the fun of it, but there might be practical reasons for that 24-bit limit to exist. For example, if the implementation uses a maximum of 16MB RAM for dynamic allocations and a separate limit of 16MB for static objects, then no object could be bigger than 16MB, but a pointer has to be at least 25 bits. Even in that case, I doubt that an implementer would bother making size_t a 24-bit type; it would most likely still be 32 bits. But the standard allows the implementer to choose what's best.
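Which is one more reason that portable code shouldn't infer allocation limits from the width of size_t at all: the only reliable signal is malloc's return value. A minimal sketch, using the hypothetical 16MB ceiling above:

```c
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Even where size_t is 32 or 64 bits, the implementation may refuse
     * any single block above some smaller internal limit; check the
     * return value instead of reasoning from sizeof(size_t). */
    size_t want = (size_t)1 << 24;   /* 2^24 bytes: one more than the imagined 2^24-1 limit */
    void *p = malloc(want);
    if (p == NULL) {
        fprintf(stderr, "allocation of %zu bytes refused\n", want);
        return EXIT_FAILURE;
    }
    free(p);
    return 0;
}
```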
You say that size_t "is guaranteed to be large enough to represent the size of the largest data object, but no larger". It's not true that it must not be any larger. For example, if some technical restriction of the platform means that no object can be larger than 2^31-1 bytes, it doesn't follow that size_t must be a 31-bit type. It is permitted to be a 32-bit type, because all that's required is that it's large enough.
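In practice this "large enough, maybe larger" guarantee is all you need: write size arithmetic in terms of size_t and SIZE_MAX and never assume an exact width. For instance, a minimal overflow-checked allocator (checked_alloc is just an illustrative name):

```c
#include <stdint.h>   /* SIZE_MAX */
#include <stdlib.h>   /* malloc */

/* Allocate count elements of size bytes each, refusing requests whose
 * total would wrap around size_t. Written only against SIZE_MAX, so it
 * works whatever width the implementation chose for size_t. */
void *checked_alloc(size_t count, size_t size)
{
    if (size != 0 && count > SIZE_MAX / size)
        return NULL;   /* count * size would overflow size_t */
    return malloc(count * size);
}
```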