What forms of memory address spaces have been used?
Today, a large flat virtual address space is common. Historically, more complicated address spaces have been used, such as a pair of a base address and an offset, a pair of a segment number and an offset, a word address plus some index for a byte or other sub-object, and so on.
From time to time, various answers and comments assert that C (or C++) pointers are essentially integers. That is an incorrect model for C (or C++), since the variety of address spaces is undoubtedly the cause of some of the C (or C++) rules about pointer operations. For example, not defining pointer arithmetic beyond an array simplifies support for pointers in a base and offset model. Limits on pointer conversion simplify support for address-plus-extra-data models.
That recurring assertion motivates this question. I am looking for information about the variety of address spaces to illustrate that a C pointer is not necessarily a simple integer and that the C restrictions on pointer operations are sensible given the wide variety of machines to be supported.
Useful information may include:
This is a broad question, so I am open to suggestions on managing it. I would be happy to see collaborative editing on a single generally inclusive answer. However, that may fail to award reputation as deserved. I suggest up-voting multiple useful contributions.
Just about anything you can imagine has probably been used. The first major division is between byte addressing (all modern architectures) and word addressing (pre-IBM 360/PDP-11, but I think modern Unisys mainframes are still word addressed). In word addressing, `char*` and `void*` would often be bigger than an `int*`; even if they were not bigger, the "byte selector" would be in the high-order bits, which were required to be 0, or would be ignored for anything other than bytes. (On a PDP-10, for example, if `p` was a `char*`, `(int)p < (int)(p+1)` would often be false, even though `int` and `char*` had the same size.)
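To make the byte-selector point concrete, here is a minimal sketch of a simulated word-addressed machine. The layout is an assumption chosen for illustration (2 selector bits at the top of a 32-bit value, 4 bytes per word) and is not the pointer format of the PDP-10 or any other real machine; the names `char_ptr`, `make_char_ptr`, `word_of`, `byte_of` and `char_ptr_next` are hypothetical.

```c
#include <stdint.h>

/* Assumed layout: the top 2 bits select a byte within a 4-byte word,
   the low 30 bits hold the word address. Illustrative only. */
#define SEL_SHIFT 30u
#define WORD_MASK ((UINT32_C(1) << SEL_SHIFT) - 1u)

typedef uint32_t char_ptr;   /* simulated representation of a char* */

static char_ptr make_char_ptr(uint32_t word, uint32_t byte_sel)
{
    return ((byte_sel & 3u) << SEL_SHIFT) | (word & WORD_MASK);
}

static uint32_t word_of(char_ptr p) { return p & WORD_MASK; }
static uint32_t byte_of(char_ptr p) { return p >> SEL_SHIFT; }

/* Simulated p + 1: advance to the next byte, carrying into the
   word address when the selector runs past the last byte. */
static char_ptr char_ptr_next(char_ptr p)
{
    uint32_t sel  = byte_of(p) + 1u;
    uint32_t word = word_of(p);
    if (sel == 4u) {
        sel = 0u;
        word += 1u;
    }
    return make_char_ptr(word, sel);
}
```

Under these assumptions, stepping from byte 3 of word 5 to byte 0 of word 6 makes the raw bit pattern smaller (`0xC0000005` becomes `0x00000006`), so comparing the pointers as integers gives the opposite answer from comparing them as addresses.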
Among byte-addressed machines, the major variants are segmented and non-segmented architectures. Both are still widespread today, although in the case of 32-bit Intel (a segmented architecture with 48-bit addresses), some of the more widely used OSs (Windows and Linux) artificially restrict user processes to a single segment, simulating flat addressing.
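As a rough illustration of why a segmented pointer is not a plain integer, the sketch below models a segment-and-offset pair in the style of 16-bit x86 real mode, where the linear address is the segment shifted left by 4 plus the offset. This is a simplification for illustration, not the 32-bit protected-mode scheme mentioned above; the type `far_ptr` and the helper functions are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated segment:offset pointer (hypothetical type). */
typedef struct {
    uint16_t segment;
    uint16_t offset;
} far_ptr;

/* Real-mode style translation: segment * 16 + offset. */
static uint32_t linear_address(far_ptr p)
{
    return ((uint32_t)p.segment << 4) + p.offset;
}

/* Pointer arithmetic that touches only the offset, so it wraps
   modulo 2^16 and never crosses into another segment. */
static far_ptr far_add(far_ptr p, uint16_t n)
{
    p.offset = (uint16_t)(p.offset + n);
    return p;
}

/* Bitwise comparison of the two fields, not of the addresses. */
static bool same_representation(far_ptr a, far_ptr b)
{
    return a.segment == b.segment && a.offset == b.offset;
}
```

In this model `{0x1234, 0x0010}` and `{0x1235, 0x0000}` name the same linear address while having different representations, and adding 1 to offset `0xFFFF` wraps back to the start of the segment. Leaving arithmetic outside an object and comparisons between unrelated objects undefined or unspecified lets an implementation use the cheap offset-only operations.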
Although I've no recent experience, I would expect even more variety in embedded processors. In particular, in the past, it was common for embedded processors to use a Harvard architecture, where code and data were in independent address spaces (so that a function pointer and a data pointer, cast to a large enough integral type, could compare equal).
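The Harvard case can be sketched with two simulated memories that are each addressed from zero. Everything here is an assumption for illustration: the sizes, the `code_ptr` and `data_ptr` types, and the load helpers are hypothetical and do not describe any particular processor.

```c
#include <stdint.h>

enum { SPACE_SIZE = 256 };

/* Two independent address spaces, both starting at address 0. */
static uint8_t code_space[SPACE_SIZE];
static uint8_t data_space[SPACE_SIZE];

/* Distinct pointer types so an address cannot be used in the
   wrong space by accident (hypothetical names). */
typedef struct { uint16_t addr; } code_ptr;
typedef struct { uint16_t addr; } data_ptr;

/* Each load goes to its own memory; the address alone does not
   say which space is meant. */
static uint8_t load_code(code_ptr p) { return code_space[p.addr % SPACE_SIZE]; }
static uint8_t load_data(data_ptr p) { return data_space[p.addr % SPACE_SIZE]; }
```

A `code_ptr` and a `data_ptr` that both hold address `0x10` compare equal once reduced to integers, yet they designate unrelated storage. This is one reason C does not define conversions between function pointers and object pointers.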