Tags: php, c++, memory-management, php-extension

In a PHP extension, what is the recommended way to return the value of a std::string?


We have a simple PHP function, whose purpose is to call a C++ free function std::string callLibrary(std::string) and to return its std::string return value.

It currently looks like this:

PHP_FUNCTION(call_library)
{
    char *arg = NULL;
    size_t arg_len;
    if (zend_parse_parameters(ZEND_NUM_ARGS(), "s", &arg, &arg_len) == FAILURE)
    {
        return;
    }

    // Call underlying library; pass the length so embedded NUL bytes survive
    std::string callResult = callLibrary(std::string(arg, arg_len));
    // 0 = per-request (tracked) allocation
    zend_string * result = zend_string_init(callResult.c_str(), callResult.size(), 0);
    RETURN_STR(result);
}

We cannot find a reference manual describing the behaviours of zend_string_init or RETURN_STR(); the closest thing we have found is: http://www.phpinternalsbook.com/php7/internal_types/strings/zend_strings.html

In particular, it states the following about the last parameter of zend_string_init:

If you pass 0, you ask the engine to use a request-bound heap allocation using the Zend Memory Manager. Such allocation will be destroyed at the end of the current request. If you don’t do it yourself, on a debug build, the engine will shout at you about a memory leak you just created. If you pass 1, you ask for what we called a “persistent” allocation, that is the engine will use a traditional C malloc() call and will not track the memory allocation in any way.

It seems we want the 0 value, but does RETURN_STR() then free the allocated memory? (The text is a bit ambiguous, but it seems to say the destruction should be explicit.) Is there a more idiomatic way to return such a std::string value from a PHP extension function?


Solution

  • To answer your question, I'll talk a little about PHP memory allocation, then about your specific question.

    About PHP Memory Allocation

    When writing PHP extensions, there are two kinds of memory allocations you can perform:

    • tracked memory allocations
    • persistent memory allocations

    A tracked memory allocation gives the PHP engine more control than a raw memory allocation would. The Zend memory manager (ZendMM) acts as a wrapper above the standard memory allocation libraries. It lets PHP avoid memory leaks by cleaning up, at the end of a request, any tracked memory that has not been explicitly freed. Furthermore, it allows the engine to enforce memory limits (such as the php.ini setting memory_limit). For these reasons, tracked memory is also referred to as per-request memory.

    A persistent memory allocation is the standard memory allocation managed by the C library (e.g. malloc and friends). It's also worth noting that in C++, new and delete typically call down to malloc and free respectively. In PHP, a persistent memory allocation survives the handling of a request, and it may exist to service more than one request. Because nothing reclaims it automatically, it is possible to cause a memory leak with these kinds of allocations.

    In the PHP API, there are some macros that are defined for performing tracked or persistent memory allocations. For example, emalloc and efree are the analogs to malloc and free for tracked (i.e. per-request) memory management. The macros pemalloc and pefree are for either tracked or persistent allocations, having a parameter to toggle. For example, pemalloc(32,1) allocates a block of 32 persistent bytes whereas pemalloc(32,0) is equivalent to emalloc(32) which allocates a block of 32 tracked bytes.
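To make the tracked/persistent distinction concrete, here is a minimal, self-contained sketch. It is a toy model only: `RequestHeap` and its methods are hypothetical names invented for illustration, and the real ZendMM is implemented very differently. It only mirrors the shape of `pemalloc(size, persistent)` and `pefree(ptr, persistent)`, plus the end-of-request cleanup described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <unordered_set>

// Hypothetical toy model of a per-request heap. NOT ZendMM's implementation;
// it only illustrates the tracked vs. persistent split.
class RequestHeap {
public:
    // Same shape as pemalloc(size, persistent): false = tracked, true = persistent.
    void* alloc(std::size_t size, bool persistent) {
        void* p = std::malloc(size);
        if (p != nullptr && !persistent) {
            tracked_.insert(p);  // remembered so end_request() can reclaim it
        }
        return p;
    }

    // Same shape as pefree(ptr, persistent): the flag must match the allocation.
    void release(void* p, bool persistent) {
        if (!persistent) {
            assert(tracked_.count(p) == 1);  // must be a live tracked block
            tracked_.erase(p);
        }
        std::free(p);
    }

    // Frees every tracked block that was never released and returns how many
    // there were. Persistent blocks are untouched and outlive the request.
    std::size_t end_request() {
        std::size_t leaked = tracked_.size();
        for (void* p : tracked_) {
            std::free(p);
        }
        tracked_.clear();
        return leaked;
    }

    std::size_t tracked_count() const { return tracked_.size(); }

private:
    std::unordered_set<void*> tracked_;
};
```

In this model, a block obtained with `alloc(32, false)` and never released is reclaimed by `end_request()`, whereas a block obtained with `alloc(32, true)` stays allocated until someone calls `release(p, true)` on it.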

    In addition to the raw memory allocation functions, the PHP API also provides control over memory allocations initiated by higher-level functions. For example, creating a PHP7 zend_string structure with zend_string_init lets you choose what kind of memory allocation you want via the third parameter. This follows a common idiom throughout the API, with 0 indicating a tracked allocation and 1 indicating a persistent allocation.

    Regarding zend_string, zend_string_init and RETURN_STR

    I'm not as familiar with PHP7 as I am with PHP5, but many concepts have carried over and I think I've read enough of the source code to answer the question. When you assign a zend_string to a zval using RETURN_STR, the zval becomes responsible for freeing the zend_string (in PHP5 this was just a char* but the concept is the same). The Zend engine expects most, if not all, objects to be allocated using the tracked memory manager. In PHP5, a string assigned to a zval must have been allocated via emalloc, since the code always calls efree on the string buffer when the zval is destroyed. In PHP7 this requirement seems to be relaxed, since the zend_string structure can remember which allocation type was used. Regardless, it's good practice to use tracked allocations by default unless you have a good reason to do otherwise. Therefore your current code looks good, since it passes 0 as the third parameter to zend_string_init.

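    The destruction of the zend_string should not be explicit in your code, since the zval will handle that at some later time. Exactly when depends on what userspace does with the returned zval, so it is not something you have to worry about. If you want a shorter form, RETURN_STRINGL(callResult.c_str(), callResult.size()) makes the same tracked copy and returns it in one step.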
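The ownership hand-off can be sketched with a small stand-alone model. Everything here is hypothetical: `ToyString`, `ToyValue`, `FreeCounters` and `call_library_model` are invented names that loosely stand in for zend_string, zval and the extension function, and the layout does not match the real engine structures. The sketch only shows the two ideas from the text: the string header remembers its allocation type, and the returned value, not the function, performs the release.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a refcounted engine string.
struct ToyString {
    unsigned refcount;
    bool persistent;    // remembered so the matching deallocator can be chosen
    std::string bytes;
};

// Records which kind of string was destroyed, so the effect is observable.
struct FreeCounters {
    int tracked = 0;
    int persistent = 0;
};

ToyString* toy_string_init(const std::string& src, bool persistent) {
    return new ToyString{1u, persistent, src};  // starts with one reference
}

void toy_string_release(ToyString* s, FreeCounters& counters) {
    assert(s->refcount > 0);
    if (--s->refcount == 0) {
        if (s->persistent) {
            ++counters.persistent;
        } else {
            ++counters.tracked;
        }
        delete s;
    }
}

// Hypothetical stand-in for a zval that may hold a string.
class ToyValue {
public:
    // Takes over the caller's reference without incrementing the count.
    void adopt(ToyString* s) { str_ = s; }

    std::size_t length() const { return str_ ? str_->bytes.size() : 0; }

    // Called by the "engine" when the value is no longer needed.
    void destroy(FreeCounters& counters) {
        if (str_ != nullptr) {
            toy_string_release(str_, counters);
            str_ = nullptr;
        }
    }

private:
    ToyString* str_ = nullptr;
};

// Models the extension function: copy the library result, hand it over.
void call_library_model(const std::string& callResult, ToyValue& return_value) {
    ToyString* result = toy_string_init(callResult, false);
    return_value.adopt(result);
    // No release here: the single reference now belongs to return_value.
}
```

After `call_library_model` returns, nothing has been freed yet; the copy is destroyed only when the holder of the value calls `destroy`, which is the role the engine plays for the real return value.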