Reduce malloc calls by slicing one big malloc'd memory

First, here is where I got the idea from:

There was once an app I wrote that used lots of little blobs of memory, each allocated with malloc(). It worked correctly but was slow. I replaced the many calls to malloc with just one, and then sliced up that large block within my app. It was much much faster.

I was profiling my application, and I got an unexpectedly nice performance boost when I reduced the number of malloc calls. I am still allocating the same amount of memory, though.

So, I would like to do what this guy did, but I am unsure what's the best way to do it.

My Idea:

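// static global variables
// (C does not allow a malloc call in a static initializer, so the block is
// allocated once at startup: memoryForStruct1 = malloc(sizeof(Struct1) * 10000);)
static Struct1 *memoryForStruct1 = NULL;
static int struct1Index = 0;
...
// somewhere, I need memory, fast:
Struct1 *data = &memoryForStruct1[struct1Index++];
...
// done with data (must be the most recently handed-out slot):
--struct1Index;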

Gotchas:

I have to make sure I don't exceed 10000
I have to release the memory in the reverse order of allocation, like a stack. (Not a major issue in my case, since I am using recursion, but I would like to avoid the restriction if possible.)

Inspired from Mihai Maruseac:

First, I created a linked list of ints that tells me which memory indexes are free. I then added a field to my struct, int memoryIndex, which lets me return the memory in any order. Luckily, I am sure my memory needs will never exceed 5 MB at any given time, so I can safely allocate that much up front. Solved.
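For anyone who wants to see the shape of that, here is a minimal sketch of such a pool, not my exact code. It assumes a made-up Struct1 with a memoryIndex field and keeps the free list as an array of "next free index" links instead of separately allocated nodes; the names pool_init, pool_acquire, pool_release and pool_destroy are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_CAPACITY 10000   /* same limit as in the question */

/* Hypothetical payload; memoryIndex records which slot the object occupies. */
typedef struct Struct1 {
    int memoryIndex;
    double value;
} Struct1;

static Struct1 *pool_slots = NULL;  /* the one big malloc'd block */
static int *pool_next = NULL;       /* pool_next[i] = free slot after i, or -1 */
static int pool_free_head = -1;     /* first free slot, or -1 when exhausted */

/* Allocate the block once and chain every slot into the free list. */
int pool_init(void)
{
    pool_slots = malloc(sizeof *pool_slots * POOL_CAPACITY);
    pool_next = malloc(sizeof *pool_next * POOL_CAPACITY);
    if (pool_slots == NULL || pool_next == NULL) {
        free(pool_slots);
        free(pool_next);
        pool_slots = NULL;
        pool_next = NULL;
        return -1;
    }
    for (int i = 0; i < POOL_CAPACITY; ++i)
        pool_next[i] = (i + 1 < POOL_CAPACITY) ? i + 1 : -1;
    pool_free_head = 0;
    return 0;
}

/* Pop a free slot in O(1); returns NULL when the pool is exhausted. */
Struct1 *pool_acquire(void)
{
    if (pool_free_head < 0)
        return NULL;
    int i = pool_free_head;
    pool_free_head = pool_next[i];
    pool_slots[i].memoryIndex = i;
    return &pool_slots[i];
}

/* Push the slot back in O(1); objects may be released in any order. */
void pool_release(Struct1 *obj)
{
    assert(obj != NULL);
    pool_next[obj->memoryIndex] = pool_free_head;
    pool_free_head = obj->memoryIndex;
}

/* Hand both blocks back to malloc's heap. */
void pool_destroy(void)
{
    free(pool_slots);
    free(pool_next);
    pool_slots = NULL;
    pool_next = NULL;
    pool_free_head = -1;
}
```

Acquire and release are both constant time, and running out of slots shows up as a NULL return instead of writing past slot 10000.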

Solution

On Unix-like systems, the classic system call that gives a process heap memory is brk (large requests may be served by mmap instead). malloc, calloc, and realloc simply hand out the space obtained this way. When that space runs out, another brk call is made to extend it, usually in multiples of the virtual memory page size.

Thus, if you really want to have a premade pool of objects, make sure to allocate memory in multiples of the page size. For example, you can create pools of 4KB, 8KB, ... of space.
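A rough sketch of the rounding, assuming a POSIX system where sysconf(_SC_PAGESIZE) reports the page size; the helper names round_up_to_multiple and pool_bytes_for are made up for this example, and the 4096 fallback is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Round `bytes` up to the next multiple of `page` (page must be > 0). */
size_t round_up_to_multiple(size_t bytes, size_t page)
{
    assert(page > 0);
    return (bytes + page - 1) / page * page;
}

/* Pool size for `count` objects of `object_size` bytes, in whole pages. */
size_t pool_bytes_for(size_t count, size_t object_size)
{
    long page = sysconf(_SC_PAGESIZE);   /* POSIX; commonly 4096 */
    if (page <= 0)
        page = 4096;                     /* assumed fallback value */
    return round_up_to_multiple(count * object_size, (size_t)page);
}
```

Note that this only rounds the size. malloc keeps its own bookkeeping and does not promise page-aligned blocks, so treat the result as a sizing hint.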

Next, look at your objects. They come in different sizes, and it is a big pain to handle allocations for all of them from the same pool. Create pools for objects of various sizes (powers of 2 work best) and allocate from them. For example, an object of size 34B would be allocated from the 64B pool.
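Picking the pool for a request then comes down to rounding the size up to a power of two. A small sketch, with hypothetical helper names next_pow2 and class_index:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest power of two >= n (returns 1 for n == 0, 0 if it would overflow). */
size_t next_pow2(size_t n)
{
    size_t p = 1;
    while (p < n) {
        if (p > SIZE_MAX / 2)
            return 0;       /* no power of two fits in size_t */
        p <<= 1;
    }
    return p;
}

/* Index of the pool serving n bytes: pool k holds chunks of 2^k bytes. */
unsigned class_index(size_t n)
{
    size_t p = next_pow2(n);
    unsigned k = 0;
    assert(p != 0);
    while (p > 1) {
        p >>= 1;
        ++k;
    }
    return k;
}
```

With this, a 34B request maps to next_pow2(34) == 64, that is, pool number 6.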

Lastly, the remaining space can either be left unused or be moved down to the other pools. In the above example, you have 30B left. You'd split it into 16B, 8B, 4B and 2B chunks and add each chunk to its respective pool.
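The split is just the binary decomposition of the leftover size. A minimal sketch that only computes the chunk sizes, largest first; split_leftover is a hypothetical name, and a real allocator would also have to track each chunk's address and alignment:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Write the power-of-two chunk sizes that sum to `leftover` into `chunks`,
 * largest first, and return how many were written (at most `max_chunks`). */
size_t split_leftover(size_t leftover, size_t *chunks, size_t max_chunks)
{
    size_t count = 0;
    /* Start at the highest bit of size_t and walk down to bit 0. */
    size_t bit = (size_t)1 << (sizeof(size_t) * CHAR_BIT - 1);

    assert(chunks != NULL || max_chunks == 0);
    for (; bit != 0 && count < max_chunks; bit >>= 1) {
        if (leftover & bit)
            chunks[count++] = bit;   /* this chunk goes to the `bit`-byte pool */
    }
    return count;
}
```

For the 30B left over from the 34B object, this yields the four chunks 16, 8, 4 and 2.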

Thus, you'd use linked lists to manage the preallocated space. This means your application will use more memory than it actually needs, but if this really helps you, why not?

Basically, what I've described is a mix of the buddy allocator and the slab allocator from the Linux kernel.

Edit: After reading your comments, it would be pretty easy to allocate a big area with malloc(BIG_SPACE) and use it as the pool for your memory.
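One simple way to do that is a bump (arena) allocator: keep an offset into the big block and advance it on every request. This is only a sketch under the assumption that everything can be released at once; the Arena type and the arena_* functions are made-up names, and it needs a C11 compiler for max_align_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Arena {
    unsigned char *base;   /* start of the one big malloc'd block */
    size_t capacity;       /* total bytes in the block (BIG_SPACE) */
    size_t used;           /* bytes handed out so far */
} Arena;

/* Grab the big block once; returns 0 on success, -1 if malloc fails. */
int arena_init(Arena *a, size_t capacity)
{
    a->base = malloc(capacity);
    a->capacity = a->base ? capacity : 0;
    a->used = 0;
    return a->base ? 0 : -1;
}

/* Hand out `size` bytes, aligned for any object type; NULL when full. */
void *arena_alloc(Arena *a, size_t size)
{
    const size_t align = _Alignof(max_align_t);
    size_t start = (a->used + align - 1) / align * align;

    if (start > a->capacity || size > a->capacity - start)
        return NULL;
    a->used = start + size;
    return a->base + start;
}

/* Release every allocation at once; the block itself is kept for reuse. */
void arena_reset(Arena *a)
{
    a->used = 0;
}

/* Return the big block to malloc's heap. */
void arena_destroy(Arena *a)
{
    free(a->base);
    a->base = NULL;
    a->capacity = 0;
    a->used = 0;
}
```

Each arena_alloc is a few arithmetic operations and no malloc call. The price is that individual objects cannot be freed, so it fits best when a whole batch of objects dies together.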