Tags: memory, programming-languages, bit-manipulation

Built-in data type with a size of less than 1 byte


Most strongly typed programming languages have data types with a minimum size of 1 byte. I know it is possible to access the individual bits of a memory cell using bit masking, but why don't programming languages support data types smaller than 1 byte?


Solution

  • For languages that offer manual memory management or address manipulation at all, the hardware imposes restrictions on those features. Very few architectures, if any, support addressing a single bit. Typically the smallest addressable unit of storage is a byte, so languages use that.

    Making every address refer to a bit either requires a larger-than-usual address representation (a performance hit, potentially twice as many instructions for anything touching addresses) or greatly limits the available address space (with 8-bit bytes, three bits of every address would go to selecting a bit within a byte). Adding a special case (a special kind of address) complicates the language for something that is rarely needed. Note that C has a related, and in my opinion more general, feature: bit-fields in structs. The struct's sizeof is still measured in bytes, but a struct with eight one-bit members may occupy a single byte overall. The bitwise operators that languages include anyway allow sub-byte types to be emulated in user code.

    In higher-level languages that have no notion of addresses at all, the size of a value is an implementation detail. The implementations are, of course, written (directly or indirectly) in lower-level languages that default to bytes rather than bits. That, along with other requirements and limitations (for example, objects need to be accessed through pointers), makes it impractical in general to expose tricks like "use a machine word, then index the bits through shifting and masking" to the language being implemented, though libraries such as BitVector for Python do exist.
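
To make the two workarounds mentioned in the answer concrete, here is a minimal C sketch, not a definitive implementation. It shows a struct of eight one-bit bit-fields and a hand-rolled bit array indexed by shifting and masking. The names `struct flags`, `bit_set`, `bit_clear` and `bit_get` are made up for this illustration. How bit-fields are packed is implementation-defined, so the sketch makes no assumption about `sizeof(struct flags)` beyond it being a whole number of bytes.

```c
#include <assert.h>  /* assert */
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* size_t, NULL */

/* Eight 1-bit members (hypothetical names). Each member holds 0 or 1,
   but the struct as a whole still has a sizeof measured in bytes, and
   no member has an address of its own. */
struct flags {
    unsigned int b0 : 1;
    unsigned int b1 : 1;
    unsigned int b2 : 1;
    unsigned int b3 : 1;
    unsigned int b4 : 1;
    unsigned int b5 : 1;
    unsigned int b6 : 1;
    unsigned int b7 : 1;
};

/* Emulating a "bit type" in user code: bit i lives in byte i / CHAR_BIT
   at position i % CHAR_BIT. The caller must make sure the array is
   large enough; no bounds checking is done here. */
static inline void bit_set(unsigned char *bits, size_t i)
{
    assert(bits != NULL);
    bits[i / CHAR_BIT] |= (unsigned char)(1u << (i % CHAR_BIT));
}

static inline void bit_clear(unsigned char *bits, size_t i)
{
    assert(bits != NULL);
    bits[i / CHAR_BIT] &= (unsigned char)~(1u << (i % CHAR_BIT));
}

static inline int bit_get(const unsigned char *bits, size_t i)
{
    assert(bits != NULL);
    return (bits[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1u;
}
```

With an array such as `unsigned char bits[2] = {0};`, calling `bit_set(bits, 9)` flips one bit in the second byte, and `bit_get(bits, 9)` reads it back. The "address" of a bit is just a byte index plus a shift count, which is the extra bookkeeping a language would otherwise have to build into every pointer.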