Tags: c, fork, free, copy-on-write

Is a call to free() in the forked process causing a copy-on-write?


From man fork(2):

   Under Linux, fork() is implemented using copy-on-write pages, so
   the only penalty that it incurs is the time and memory required
   to duplicate the parent's page tables, and to create a unique
   task structure for the child.

Therefore, in an example like this one:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    char *data = malloc(100);

    snprintf(data, 100, "%s", "Hello world!");

    pid_t pid = fork();

    if (pid == -1)
    {
        perror("fork");
        exit(EXIT_FAILURE);
    }
    if (pid == 0)
    {
        // start of child process
        printf("I'm the child\n");
        // Do the child stuff ...
        puts(data);
    }
    else
    {
        // start of parent process
        printf("I'm the parent\n");
        // Do the parent stuff ...
        puts(data);
        // Wait for child process
        if (wait(NULL) == -1)
        {
            perror("wait");
            exit(EXIT_FAILURE);
        }
        free(data);
    }
    return 0;
}

From what I understand, if a resource is duplicated but not modified, there is no need to create a new copy of it. The memory reserved with malloc() is therefore not copied into the forked process; instead, the child gets a pointer to the memory reserved in the parent process.

On the other hand, if the resource is not freed in the child process, we get a memory leak:

==13024== HEAP SUMMARY:
==13024==     in use at exit: 100 bytes in 1 blocks
==13024==   total heap usage: 2 allocs, 1 frees, 1,124 bytes allocated
==13024== 
==13024== LEAK SUMMARY:
==13024==    definitely lost: 100 bytes in 1 blocks
==13024==    indirectly lost: 0 bytes in 0 blocks
==13024==      possibly lost: 0 bytes in 0 blocks
==13024==    still reachable: 0 bytes in 0 blocks
==13024==         suppressed: 0 bytes in 0 blocks
==13024== Rerun with --leak-check=full to see details of leaked memory
==13024== 
==13024== For lists of detected and suppressed errors, rerun with: -s
==13024== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==13023== 
==13023== HEAP SUMMARY:
==13023==     in use at exit: 0 bytes in 0 blocks
==13023==   total heap usage: 2 allocs, 2 frees, 1,124 bytes allocated
==13023== 
==13023== All heap blocks were freed -- no leaks are possible
==13023== 
==13023== For lists of detected and suppressed errors, rerun with: -s
==13023== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

This indicates that free() should also be called in the forked process:

    if (pid == 0)
    {
        // start of child process
        printf("I'm the child\n");
        // Do the child stuff ...
        puts(data);
        free(data);
    }

So, does a call to free() in the forked process cause a copy-on-write?


Solution

  • Yes, it certainly does.

    Memory copy-on-write (CoW) happens on a different layer than malloc()/free().

    When a process is forked, the child's private writable pages are initially shared with the parent and marked read-only in both processes. When the child (or the parent) writes to such a shared page, the write triggers a page fault, and only then does the operating system copy the data to another area of physical RAM and update the mapping for the writing process.

    malloc() and free() do not allocate physical RAM. They are memory management functions, with memory defined as "the (virtual) address space of a process". These C library functions keep track of an internal state of allocated memory chunks, and malloc() and free() only modify these libc-internal data structures (with the exception of requesting more address space from the OS when malloc()-ing). Physical RAM allocation only happens at page fault, most commonly when a process accesses newly assigned memory for the first time.

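    So, in this respect, yes. free() must modify memory to record that a region has been freed. In common allocators such as glibc's, that bookkeeping lives in or next to the freed chunk itself, so free() typically writes to a page that is still shared with the parent, and at the lower level this causes a page fault and a copy of that page (i.e. CoW).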