Does _PyString_Resize realloc memory?


I'm working on a Python C extension module (for CPython 2.5). It calls some underlying network API that fills a buffer.

Currently the code is written basically as follows:

PyObject * buffer;
char * cbuf;
size_t buffer_size = 1024;
int sz;
buffer = PyString_FromStringAndSize(NULL, buffer_size);
if (buffer == NULL) return NULL;
cbuf = PyString_AsString(buffer);
Py_BEGIN_ALLOW_THREADS
sz = read(cbuf, buffer_size);
Py_END_ALLOW_THREADS
if (sz > 0 && sz != buffer_size && _PyString_Resize(&buffer, sz) < 0)
    return NULL;

As far as I know this code works fine, but I wonder about the internals of _PyString_Resize. If sz is smaller than buffer_size, does it reuse the existing buffer or does it reallocate memory?

From an efficiency point of view I would probably prefer the former, to avoid a useless copy of the buffer content even if it consumes more memory than necessary. On the other hand, reallocating memory may also have its point: it reduces the memory footprint.

So which one does _PyString_Resize do? And is there an easy way to control this behavior?


Solution

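  • Yes, _PyString_Resize does realloc (it calls PyObject_REALLOC on the string object) - after all, this is what you asked it to do :-) Whether the bytes are actually copied when shrinking is up to the underlying allocator.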

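    If you want to avoid the reallocation, you can read into a buffer on the stack and then create the string object from it. PyString_FromStringAndSize still copies the sz bytes once, but the string is allocated at exactly the size needed. Something like (not compiled or tested, so treat it as pseudocode):

    char cbuf[BUFFER_SIZE];
    PyObject * buffer;
    int sz = read(cbuf, BUFFER_SIZE);  /* the network call from the question */
    if (sz < 0)
        return NULL;  /* set an exception before returning */
    buffer = PyString_FromStringAndSize(cbuf, sz);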
    

    Also, note the warning above the implementation of _PyString_Resize (it's in Objects/stringobject.c):

    The following function breaks the notion that strings are immutable:
    it changes the size of a string. We get away with this only if there is only one module referencing the object. You can also think of it as creating a new string object and destroying the old one, only more efficiently. In any case, don't use this if the string may already be known to some other part of the code...
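
To compare the two strategies outside the interpreter, here is a minimal sketch in plain C with no Python API involved. fake_read is a hypothetical stand-in for the network call, and read_then_shrink and read_then_copy are illustrative names, not CPython functions. The first mirrors "allocate at full size, fill, then resize"; the second mirrors "read into a stack buffer, then allocate the exact size and copy".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the network read: writes up to cap bytes
   into dst and returns the number of bytes written. */
static size_t fake_read(char *dst, size_t cap)
{
    static const char payload[] = "hello";
    size_t n = sizeof(payload) - 1;
    if (n > cap)
        n = cap;
    memcpy(dst, payload, n);
    return n;
}

/* Strategy 1: allocate the full capacity, fill it, then shrink with
   realloc. The allocator decides whether the block moves. */
static char *read_then_shrink(size_t cap, size_t *out_len)
{
    char *buf, *shrunk;
    size_t n;

    assert(cap > 0);
    buf = malloc(cap);
    if (buf == NULL)
        return NULL;
    n = fake_read(buf, cap);
    /* Avoid realloc(ptr, 0), whose behavior is implementation-defined. */
    shrunk = realloc(buf, n > 0 ? n : 1);
    if (shrunk == NULL) {
        free(buf);  /* on failure the original block is still valid */
        return NULL;
    }
    *out_len = n;
    return shrunk;
}

/* Strategy 2: read into a stack buffer, then allocate exactly the
   needed size and copy once. */
static char *read_then_copy(size_t *out_len)
{
    char tmp[1024];
    size_t n = fake_read(tmp, sizeof(tmp));
    char *buf = malloc(n > 0 ? n : 1);

    if (buf == NULL)
        return NULL;
    memcpy(buf, tmp, n);
    *out_len = n;
    return buf;
}
```

Both functions return a heap block that the caller must free, holding the same bytes. The difference is where the possible copy happens: inside realloc in the first case (only if the allocator moves the block), and always in the explicit memcpy in the second.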