Tags: c, assembly, kernel, virtual-memory, osdev

How to remap a kernel


This is a theoretical question as I can't seem to find any reference on how to do this.

I am writing a small kernel and I already have virtual memory working.

I have defined my memory map (taking inspiration from https://www.kernel.org/doc/Documentation/x86/x86_64/mm.txt) and now I want to set up the layout.

I have the following function that maps physical pages to virtual addresses:

void mapMemory(void* virtualMemory, void* physicalMemory);

Given an arbitrary memory region (somebuffer in the example) defined by the following struct:

struct region {
    void * base;
    uint64_t size;
};

I can remap physical pages to arbitrary addresses (OFFSET in the example):

uint64_t address = (uint64_t)somebuffer.base;
uint64_t size    = somebuffer.size;

/* Map each 4 KiB page of the buffer to the same offset from OFFSET. */
uint64_t index = 0;
for (uint64_t i = address; i < (address+size); i += 0x1000) {
    mapMemory((void*)(OFFSET+index), (void*)i);
    index += 0x1000;
}

and then manually fix the pointers like this:

somebuffer.base = (void*)OFFSET;

This way I can still access somebuffer.

What I want is to remap the entire runtime (kernel segments, stack, EFI runtime services, etc) but I'm lacking that part of osdev theory. So, any help would be greatly appreciated.

Some ideas I had so far on how to solve this:

  • For the stack: just add the offset to rbp and rsp using embedded assembly.
  • For the kernel: modifying the entries in the GDT.
  • For the kernel: altering the segment registers and doing a far jump.

Thank you in advance!

P.S. You can check a working solution in the answer below.


Solution

  • The stack is addressed relatively, so there is no need to add stack offsets or anything like that. How local variables are addressed depends on whether a frame pointer is used. With gcc, you can omit the frame pointer with -fomit-frame-pointer; local variables are then accessed at positive offsets from RSP. If you do use a frame pointer, local variables are accessed at negative offsets from RBP.

    The problem with changing the stack's position in your kernel revolves around how stack frames are allocated. If your compiler has reserved a certain amount of stack for your main function and you move the stack inside that function to a place with less room than the frame needs, this will break your kernel down the road. The code will then access stack memory outside the region you intended (which may not be paged, or may contain other important code or data). Also, when another function is called, it might overwrite data that the main function has already initialized and used.

    Along the same lines, if you change the position of the stack in the main function, you need to make sure that nothing initialized earlier in that function is used afterwards. Your stack is now somewhere else, so you cannot assume the data in your stack frame is still the same (unless you rebuild the frame manually).

    This is exactly what paging is for. You can change where the stack appears in virtual memory and have that region point to any position in physical memory (including the physical pages it already occupies).
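As a rough sketch of that idea, the loop from the question can be wrapped in a helper that maps a page-aligned physical range (for example, the pages backing the current stack) at a new virtual address. The mapMemory below is only a hypothetical stand-in that records the requested mappings so the loop can be tried on a host machine; in the kernel it would be the real page-table routine. The names remapRegion, g_mappings and MAX_MAPPINGS are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE    0x1000ULL
#define MAX_MAPPINGS 64

/* Hypothetical stand-in for the kernel's page-table code: it only records
 * each requested mapping instead of touching real page tables. */
struct mapping { uint64_t virt, phys; };
static struct mapping g_mappings[MAX_MAPPINGS];
static size_t g_mapping_count;

static void mapMemory(void *virtualMemory, void *physicalMemory)
{
    assert(g_mapping_count < MAX_MAPPINGS);
    g_mappings[g_mapping_count].virt = (uint64_t)(uintptr_t)virtualMemory;
    g_mappings[g_mapping_count].phys = (uint64_t)(uintptr_t)physicalMemory;
    g_mapping_count++;
}

/* Map `size` bytes of physical memory starting at `phys` to the virtual
 * address `virt`, one 4 KiB page at a time. Both must be page aligned. */
static void remapRegion(uint64_t virt, uint64_t phys, uint64_t size)
{
    assert(virt % PAGE_SIZE == 0 && phys % PAGE_SIZE == 0);
    for (uint64_t off = 0; off < size; off += PAGE_SIZE)
        mapMemory((void *)(uintptr_t)(virt + off),
                  (void *)(uintptr_t)(phys + off));
}
```

Because the physical pages are unchanged, the data on the stack is still there under the new virtual address. Any saved pointers into the old virtual range (including RSP and RBP themselves) still hold old addresses, which is why the old mapping is usually kept alive until nothing refers to it.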

    The kernel's position doesn't need to be changed either. What is normally done is that the kernel is linked to run in the top half of the virtual address space. The kernel is loaded somewhere low in physical memory, and the page tables are set up so that the addresses the kernel's code was linked for point to that physical location.
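A minimal sketch of the address arithmetic behind such a higher-half layout follows. The two base addresses are assumptions chosen for illustration (the virtual base matches the kernel text mapping listed in the mm.txt document linked in the question; the 1 MiB physical load address is hypothetical), and the helper names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, for illustration only. */
#define KERNEL_VIRT_BASE 0xFFFFFFFF80000000ULL /* address the kernel is linked at */
#define KERNEL_PHYS_BASE 0x0000000000100000ULL /* where the loader placed it      */

/* Translate an address inside the kernel image from virtual to physical. */
static uint64_t kernel_virt_to_phys(uint64_t virt)
{
    assert(virt >= KERNEL_VIRT_BASE);
    return virt - KERNEL_VIRT_BASE + KERNEL_PHYS_BASE;
}

/* Translate an address inside the kernel image from physical to virtual. */
static uint64_t kernel_phys_to_virt(uint64_t phys)
{
    assert(phys >= KERNEL_PHYS_BASE);
    return phys - KERNEL_PHYS_BASE + KERNEL_VIRT_BASE;
}

/* Index into one level of the 4-level x86-64 page tables for `virt`.
 * level 3 = PML4, 2 = PDPT, 1 = PD, 0 = PT; each index is 9 bits wide. */
static unsigned page_table_index(uint64_t virt, unsigned level)
{
    assert(level <= 3);
    return (unsigned)((virt >> (12 + 9 * level)) & 0x1FF);
}
```

With these assumed constants, the kernel's first page lives in PML4 entry 511 and PDPT entry 510, so those are the entries the early page tables have to fill in before jumping to the higher-half code.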

    Some of the other ideas you mention don't really apply. Modifying the GDT has no effect on paging. It is reasonable to install your own GDT after leaving UEFI, but only to "take control" of it (so that you know its entry numbers, etc.). The GDT should still describe a flat memory model spanning the whole address space. In any case, as far as I know, segment limits in the GDT are ignored in long mode, and so are the bases of CS, DS, ES and SS. You should really just use paging.
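For illustration, here is one way a flat GDT entry could be packed into its 8-byte descriptor format. The helper name gdt_entry and the choice of access and flag values are assumptions for this sketch; the bit positions follow the standard x86 segment descriptor layout.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a legacy-format segment descriptor.
 * access: P, DPL, S, type bits (descriptor byte 5)
 * flags:  G, D/B, L, AVL       (upper nibble of descriptor byte 6) */
static uint64_t gdt_entry(uint32_t base, uint32_t limit,
                          uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFF && flags <= 0xF);
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);              /* limit bits 0..15  */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;       /* base bits 0..23   */
    d |= (uint64_t)access << 40;                  /* access byte       */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;   /* limit bits 16..19 */
    d |= (uint64_t)(flags & 0xF) << 52;           /* G, D/B, L, AVL    */
    d |= (uint64_t)((base >> 24) & 0xFF) << 56;   /* base bits 24..31  */
    return d;
}

/* A minimal flat GDT: null descriptor, 64-bit kernel code, kernel data. */
static void build_flat_gdt(uint64_t gdt[3])
{
    gdt[0] = 0;                                   /* mandatory null entry      */
    gdt[1] = gdt_entry(0, 0xFFFFF, 0x9A, 0xA);    /* code: present, ring 0, L=1 */
    gdt[2] = gdt_entry(0, 0xFFFFF, 0x92, 0xC);    /* data: present, ring 0, rw  */
}
```

The base is zero and the limit is at its maximum in both entries, which is what "flat" means here: segmentation is configured so that it gets out of the way, and all address translation is left to paging.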

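    If you want to replace the GDT on UEFI exit, then you are right that you have to reload the segment registers, but for CS the usual approach is a far return. 64-bit mode doesn't support direct far jumps with an immediate selector:offset operand (only the indirect, through-memory form exists). For the far return, push the new CS value first and the return address second, so that the return address is on top of the stack when the far return instruction executes.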
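A rough sketch of reloading CS with a far return, using GCC/Clang inline assembly in AT&T syntax, might look like the following. The function names reload_cs and read_cs are made up for this example, and the code assumes the new selector has the same privilege level as the current one (otherwise the far return would also pop SS:RSP). It also assumes the surrounding code does not keep data in the red zone, which is the case for kernels built with -mno-red-zone. On non-x86-64 targets the functions compile to no-ops.

```c
#include <assert.h>
#include <stdint.h>

/* Read the current code segment selector (returns 0 on non-x86-64 targets). */
static inline uint16_t read_cs(void)
{
#if defined(__x86_64__)
    uint16_t cs;
    __asm__ volatile("mov %%cs, %0" : "=r"(cs));
    return cs;
#else
    return 0;
#endif
}

/* Reload CS with `selector` by pushing CS:RIP and executing a far return.
 * In a freestanding kernel, replace assert with your own check. */
static inline void reload_cs(uint16_t selector)
{
    /* Same-privilege reload only: the selector's RPL must equal the CPL. */
    assert((selector & 3) == (read_cs() & 3));
#if defined(__x86_64__)
    uint64_t sel = selector;
    uint64_t tmp;
    __asm__ volatile(
        "pushq %[sel]\n\t"            /* new CS, popped second by lretq */
        "leaq 1f(%%rip), %[tmp]\n\t"  /* address of the label below     */
        "pushq %[tmp]\n\t"            /* return RIP, popped first       */
        "lretq\n\t"                   /* far return: pops RIP, then CS  */
        "1:\n\t"
        : [tmp] "=&r"(tmp)
        : [sel] "r"(sel)
        : "memory");
#endif
}
```

The data segment registers do not need this trick; they can be loaded with plain mov instructions once the new GDT is active.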

    As for UEFI: the boot services are gone once you call ExitBootServices(), so there is nothing to remap there. The runtime services do survive, and the UEFI specification provides SetVirtualAddressMap() for exactly this case: you call it once to tell the firmware the virtual addresses at which you have mapped its runtime regions (the memory map entries flagged EFI_MEMORY_RUNTIME). UEFI services are not system calls; they are ordinary functions reached through pointers in the system table, so IA32_LSTAR and syscall numbers are not involved. Apart from the few runtime services you decide to keep, you are better off forgetting about the firmware once you exit the UEFI environment and concentrating on building your own execution environment from scratch, along with your own syscalls and drivers.