8086 Reset vector above 20 bits with buses of 20 bits


How can the CPU fetch instructions from the address 0xFFFFFFF0 (CS base 0xFFFF0000 + IP 0xFFF0) if that address is above the 1 MiB limit of a 20-bit address bus?

  • I understand that the CS register starts with a base address of 0xFFFF0000.
  • But I don't understand how the CPU can put this address on a bus that is only 20 bits wide.

I have read the other posts; they only discuss the fact that the CS register is hardwired to get the 0xFFFF0000 base address, not the bus limit.


Solution

  • Originally (e.g. for 8088, 8086, 80186), physical addresses were 20 bits (giving 1 MiB of physical address space). At power-on and reset, CS:IP was set so that the first instruction fetch came from physical address 0xFFFF0 (0xFFFF:0x0000 on these early CPUs, 0xF000:0xFFF0 on later ones; both pairs give 0xFFFF0), and the firmware's ROM was at the end of the physical address space (e.g. ending at physical address 0xFFFFF).

    Note that this value is not "above 20 bits" because of the way "segment:offset" was converted into a physical address (specifically, "physical = segment * 16 + offset").
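
    As a minimal sketch of the "segment * 16 + offset" rule above (the function name `real_mode_physical` is made up for illustration), the reset address can be computed directly to confirm that it fits in 20 bits:

```python
def real_mode_physical(segment: int, offset: int) -> int:
    """Original real-mode rule: physical = segment * 16 + offset.

    No wraparound is applied here; this only shows the raw arithmetic.
    """
    return (segment << 4) + offset


# The reset value quoted in the answer.
reset_address = real_mode_physical(0xF000, 0xFFF0)

# A different segment:offset pair that names the same byte.
alias_address = real_mode_physical(0xFFFF, 0x0000)

# Both land 16 bytes below the top of the 1 MiB (20-bit) address space.
fits_in_20_bits = reset_address < (1 << 20)
```

    Here `reset_address` and `alias_address` are both 0xFFFF0, and `fits_in_20_bits` is true, so nothing about the original reset vector needs more than 20 address lines.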

    Also note that for physical addresses that didn't fit in 20 bits, the highest bit was discarded to make it fit in 20 bits (e.g. "0xFFFF:0xFFFE = 0xFFFF*16 + 0xFFFE = 0x10FFEE = 0x0FFEE"). This led to special hackery ("A20 gate") to preserve backward compatibility by disabling the 21st address line (A20) when 80286 was released (and the physical address size was increased to 24 bits giving 16 MiB of physical address space).
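
    The wraparound and the A20 gate can be sketched as follows. This is an illustrative model, not a description of any particular chipset; the function names and the `a20_enabled` flag are assumptions made for the example:

```python
ADDRESS_MASK_20_BIT = (1 << 20) - 1  # 0xFFFFF
A20_BIT = 1 << 20                    # the 21st address line


def physical_20_bit_cpu(segment: int, offset: int) -> int:
    """CPU with only 20 address lines: the carry into bit 20 is lost."""
    return ((segment << 4) + offset) & ADDRESS_MASK_20_BIT


def physical_with_a20_gate(segment: int, offset: int, a20_enabled: bool) -> int:
    """CPU with more address lines, plus a gate that can force A20 low."""
    address = (segment << 4) + offset
    if not a20_enabled:
        address &= ~A20_BIT  # emulate the old wraparound
    return address


wrapped = physical_20_bit_cpu(0xFFFF, 0xFFFE)
gate_off = physical_with_a20_gate(0xFFFF, 0xFFFE, a20_enabled=False)
gate_on = physical_with_a20_gate(0xFFFF, 0xFFFE, a20_enabled=True)
```

    With the gate disabled the result matches the 20-bit CPU (0x0FFEE); with it enabled the full 0x10FFEE reaches the bus. Masking a single bit is enough in this model because the largest real-mode sum, 0xFFFF0 + 0xFFFF, never sets any bit above bit 20.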

    The other change that happened for 80286 is that (to support protected mode) segment registers, which were originally just a 16-bit integer, gained some hidden pieces, primarily a hidden "segment base" value. In protected mode you have the visible value that was loaded into the segment register, while the details of the segment (base address, limit) are loaded from a table (global descriptor table or local descriptor table) and are not directly implied by the visible value. The physical address calculation was changed to use the hidden value (e.g. "physical = segment.base + offset"), and in real mode the segment base was set on segment loads (e.g. "segment.base = value * 16") so that it all worked the same when in real mode.
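
    A rough model of a segment register with a hidden base might look like this. The class and method names are hypothetical, and the model ignores limits, access rights and descriptor tables; it only shows that "base + offset" reproduces the old result when real-mode loads set the base to "value * 16":

```python
from dataclasses import dataclass


@dataclass
class SegmentRegister:
    selector: int = 0  # visible 16-bit value
    base: int = 0      # hidden base, used for every address calculation

    def load_real_mode(self, value: int) -> None:
        """Real-mode segment load: the hidden base is derived from the value."""
        self.selector = value & 0xFFFF
        self.base = self.selector << 4  # value * 16


def physical(segment: SegmentRegister, offset: int) -> int:
    """Address calculation with hidden state: physical = segment.base + offset."""
    return segment.base + offset


ds = SegmentRegister()
ds.load_real_mode(0x1234)
address = physical(ds, 0x0010)  # same as 0x1234 * 16 + 0x10
```

    Because the calculation only ever reads `base`, anything that sets `base` to a value other than `selector * 16` changes the resulting address without changing the visible selector.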

    Later (starting with 80386) the physical address space size was increased (first to 32 bits, then to 36 bits, then to an "architectural maximum of 52 bits"). When this happened, the value loaded into the hidden part of CS at power-on/reset was changed. Specifically, the visible part remained the same (0xF000) but the hidden "segment base" part was set to 0xFFFF0000, so that the first fetch effectively comes from "0xFFFF0000 + 0xFFF0 = 0xFFFFFFF0". This is why the address in the question never has to fit on a 20-bit bus: CPUs that reset to this address have at least 32 address bits. In addition, the firmware's ROM was moved to the end of the 32-bit address space (e.g. ending at physical address 0xFFFFFFFF). For compatibility, a piece of the ROM was copied (possibly decompressed) into RAM at the old address (ending at physical address 0x000FFFFF), and the memory controller was configured so that writes to this "legacy ROM area" were ignored (so that it still behaved like ROM, but was faster because RAM was faster than ROM).
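
    The reset state described above can be sketched like this. The names are hypothetical, and the offset 0xE05B used after the jump is only an example value; the point is that the hidden base, not the visible 0xF000, decides where the first fetch goes, and that a later real-mode load of CS replaces the base with "value * 16" as described for segment loads earlier:

```python
from dataclasses import dataclass


@dataclass
class CodeSegment:
    selector: int  # visible part
    base: int      # hidden part


def fetch_address(cs: CodeSegment, ip: int) -> int:
    """Address of the next instruction on a CPU with 32 address bits."""
    return (cs.base + ip) & 0xFFFFFFFF


def far_jump_real_mode(cs: CodeSegment, new_selector: int) -> None:
    """A real-mode load of CS recomputes the hidden base as value * 16."""
    cs.selector = new_selector & 0xFFFF
    cs.base = cs.selector << 4


# State at power-on/reset: visible selector 0xF000, hidden base 0xFFFF0000.
cs = CodeSegment(selector=0xF000, base=0xFFFF0000)
first_fetch = fetch_address(cs, 0xFFF0)

# After the first far jump the base follows the visible selector again.
far_jump_real_mode(cs, 0xF000)
after_jump = fetch_address(cs, 0xE05B)
```

    In this model `first_fetch` is 0xFFFFFFF0, just below the top of the 32-bit space, while `after_jump` is 0xFE05B, back inside the first 1 MiB.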

    Of course now (for UEFI, in theory and/or once "hybrid BIOS+UEFI" permanently ceases to exist next year) this can disappear: there's no need to copy a part of the ROM into the "legacy ROM area" and no need to configure the memory controller to ignore writes to that area, and we could just have a large (3 GiB?) area of normal usable RAM from 0x00000000 to 0xBFFFFFFF.