
Other than malloc/free does a program need the OS to provide anything else?


I'm working on designing the kernel (which I'm going to actually call the "core" just to be different, but it's basically the same) for an OS I'm working on. The specifics of the OS itself are irrelevant if I can't get multi-tasking, memory management, and other basic things up and running, so I need to work on that first. I have some questions about designing a malloc routine.

I figure that malloc() is either going to be a part of the kernel itself (I'm leaning towards this) or a part of the program, but I'm going to have to write my own implementation of the C standard library either way, so I get to write a malloc. My question is actually rather simple in this regard: how does C (or C++) manage its heap?

What I've always been taught in theory classes is that the heap is an ever-expanding piece of memory, starting at a specified address, and in a lot of senses behaving like a stack. In this way, I know that variables declared in global scope are at the beginning, and more variables are "pushed" onto the heap as they are declared in their respective scopes, and variables that go out of scope are simply left in memory space, but that space is marked as free so the heap can expand more if it needs to.

What I need to know is, how on earth does C actually handle a dynamically expanding heap in this manner? Does a compiled C program make its own calls to a malloc routine and handle its own heap, or do I need to provide it with an automatically expanding space? Also, how does the C program know where the heap begins?

Oh, and I know that the same concepts apply to other languages, but I would like any examples to be in C/C++ because I'm most comfortable with that language. I also would like to not worry about other things such as the stack, as I think I'm able to handle things like this on my own.

So I suppose my real question is, other than malloc/free (which handles getting and freeing pages for itself, etc) does a program need the OS to provide anything else?

Thanks!

EDIT: I'm more interested in how C uses malloc in relation to the heap than in the actual workings of the malloc routine itself. If it helps, I'm doing this on x86, but C is cross compiler so it shouldn't matter. ^_^

EDIT FURTHER: I understand that I may be getting terms confused. I was taught that the "heap" was where the program stored things like global/local variables. I'm used to dealing with a "stack" in assembly programming, and I just realized that I probably mean that instead. A little research on my part shows that "heap" is more commonly used to refer to the total memory that a program has allocated for itself, or, the total number (and order) of pages of memory the OS has provided.

So, with that in mind, how do I deal with an ever-expanding stack? (It does appear that my C theory class was mildly... flawed.)


Solution

  • malloc is generally implemented in the C runtime in userspace, relying on a few system calls (such as brk/sbrk or mmap on Unix-like systems) to map in pages of virtual memory. The job of malloc and free is to manage those pages of memory, which are fixed in size (typically 4 KB, but sometimes bigger), and to slice and dice them into pieces that applications can use.

    See, for example, the GNU libc implementation.

    For a much simpler implementation, check out the MIT operating systems class from last year. Specifically, see the final lab handout, and take a look at lib/malloc.c. This code uses the operating system JOS developed in the class. The way it works is that it reads through the page tables (provided read-only by the OS), looking for unmapped virtual address ranges. It then uses the sys_page_alloc and sys_page_unmap system calls to map and unmap pages into the current process.
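
To make the division of labor concrete, here is a minimal sketch of a userspace allocator that asks the OS for whole pages and carves them into smaller pieces. It is not how glibc or the JOS lab code works; it is a deliberately simple "bump" allocator that never reuses freed memory. It assumes a POSIX-like system that provides mmap and sysconf, and the names toy_malloc, toy_free and grow_heap are made up for this illustration. On your own OS, the mmap call would be replaced by whatever page-mapping system call your kernel exposes.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical allocator state: the unused part of the current chunk. */
static char *chunk_cur = NULL;   /* next free byte */
static char *chunk_end = NULL;   /* one past the last usable byte */

/* Ask the OS for at least min_bytes of fresh memory, rounded up to pages.
 * This is the only place where the allocator talks to the kernel. */
static int grow_heap(size_t min_bytes)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    if (min_bytes > SIZE_MAX - page)
        return -1;                               /* rounding would overflow */
    size_t len = (min_bytes + page - 1) / page * page;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* Any space left in the previous chunk is simply abandoned here. */
    chunk_cur = p;
    chunk_end = chunk_cur + len;
    return 0;
}

/* Hand out the next size bytes, rounded up to a 16-byte multiple. */
void *toy_malloc(size_t size)
{
    if (size == 0 || size > SIZE_MAX - 15)
        return NULL;
    size = (size + 15) & ~(size_t)15;

    if (chunk_cur == NULL || (size_t)(chunk_end - chunk_cur) < size) {
        if (grow_heap(size) != 0)
            return NULL;
    }

    void *out = chunk_cur;
    chunk_cur += size;
    assert(chunk_cur <= chunk_end);
    return out;
}

/* A bump allocator cannot reuse memory, so freeing is a no-op. */
void toy_free(void *ptr)
{
    (void)ptr;
}
```

A program would call toy_malloc(10) and toy_free(p) exactly as it calls malloc and free. The kernel only ever sees requests for whole pages; everything finer-grained (block sizes, alignment, and in a real allocator the free lists) is bookkeeping done by the library inside the process.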