I have a module method which takes in a python list, and then outputs the same list with all items multiplied by 100.
I've attemped to follow the C intro here as close as possible but still running into issues.
static PyObject *
test_update_list(PyObject *self, PyObject *args)
{
PyObject *listObj = NULL;
PyObject *item = NULL;
PyObject *mult = PyLong_FromLong(100);
PyObject *incremented_item = NULL;
if (!PyArg_ParseTuple(args, "O", &listObj))
{
return NULL;
}
/* get the number of lines passed to us */
Py_ssize_t numLines = PyList_Size(listObj);
/* should raise an error here. */
if (numLines < 0) return NULL; /* Not a list */
for (Py_ssize_t i=0; i<numLines; i++) {
// pick the item
item = PyList_GetItem(listObj, i);
if (mult == NULL)
goto error;
// increment it
incremented_item = PyNumber_Add(item, mult);
if (incremented_item == NULL)
goto error;
// update the list item
if (PyObject_SetItem(listObj, i, incremented_item) < 0)
goto error;
}
error:
Py_XDECREF(item);
Py_XDECREF(mult);
Py_XDECREF(incremented_item);
return listObj;
};
The above complies fine, however when I run in ipython, I get the below error.
If I take away the error handling I get a seg fault.
---------------------------------------------------------------------------
SystemError Traceback (most recent call last)
SystemError: null argument to internal routine
The above exception was the direct cause of the following exception:
SystemError Traceback (most recent call last)
<ipython-input-3-da275aa3369f> in <module>()
----> 1 testadd.test_update_list([1,2,3])
SystemError: <built-in function ulist> returned a result with an error set
Any help is appreciated.
So you have a number of issues that all need to be corrected. I've listed them all under separate headings so you can go through them one at a time.
listObj
When you get an error in your for loop, you would goto
the error label, which was still returning the list. By returning this list you hide that there was an error in your function. You must always return NULL
when you expect your function to raise an exception.
listObj
ref count on returnWhen your function is invoked you are given a borrowed reference to your arguments. When you return one of those arguments you are creating a new reference to your list, and so must increment its reference count. Otherwise the interpreter will have a reference count that is one lower than the number of actual references to the object. This will end up with a bug where the interpreter deallocates your list when there is only 1 reference rather than 0! This could result in a seg fault, or it could in the worst case scenario result in random parts of the program access the that has since been deallocated and allocated for some other object.
PyObject_SetItem
with primitivePyObject_SetItem
can be used with dicts and other class that implements obj[key] = val
. So you cannot supply it with an argument of type Py_ssize_t
. Instead, use PyList_SetItem
which only accepts Py_ssize_t
as its index argument.
item
and incremented_item
PyObject_SetItem
and PyList_SetItem
both handle decreasing the reference count of the object that was already at the position that was being set. So we don't need to worry about managing the reference count of item
as we are only working with a reference borrowed from the list. These pair of functions also steal a reference to incremented_item
, and so we don't need to worry about managing its reference count either.
For example, when you call your function with an int. You will create a new reference to the 100 int object, but because you return NULL
rather than goto error
, this reference will be lost. As such you need to handle such scenarios differently. In my solution, I move the PyLong_FromLong
call to after the arg and type checking. In this way we are only create this new* object once we are guaranteed it will be used.
Side note: I removed the goto statements as there was only one left, and so it made more sense to do the error handling at that point rather than later.
static PyObject *
testadd_update_list(PyObject *self, PyObject *args)
{
PyObject *listObj = NULL;
PyObject *item = NULL;
PyObject *mult = NULL;
PyObject *incremented_item = NULL;
Py_ssize_t numLines;
if (!PyArg_ParseTuple(args, "O:update_list", &listObj))
{
return NULL;
}
if (!PyList_Check(listObj)) {
PyErr_BadArgument();
return NULL;
}
/* get the number of lines passed to us */
// Don't want to rely on the error checking of this function as it gives a weird stack trace.
// Instead, we use Py_ListCheck() and PyErr_BadArgument() as above. Since list is definitely
// a list now, then PyList_Size will never throw an error, and so we could use
// PyList_GET_SIZE(listObj) instead.
numLines = PyList_Size(listObj);
// only initialise mult here, otherwise the above returns would create a memory leak
mult = PyLong_FromLong(100);
if (mult == NULL) {
return NULL;
}
for (Py_ssize_t i=0; i<numLines; i++) {
// pick the item
// It is possible for this line to raise an error, but our invariants should
// ensure no error is ever raised. `list` is always of type list and `i` is always
// in bounds.
item = PyList_GetItem(listObj, i);
// increment it, and check for type errors or memory errors
incremented_item = PyNumber_Add(item, mult);
if (incremented_item == NULL) {
// ERROR!
Py_DECREF(mult);
return NULL;
}
// update the list item
// We definitely have a list, and our index is in bounds, so we should never see an error
// here.
PyList_SetItem(listObj, i, incremented_item);
// PyList_SetItem steals our reference to incremented_item, and so we must be careful in
// how we handle incremented_item now. Either incremented_item will not be our
// responsibility any more or it is NULL. As such, we can just remove our Py_XDECREF call
}
// success!
// We are returning a *new reference* to listObj. We must increment its ref count as a result!
Py_INCREF(listObj);
Py_DECREF(mult);
return listObj;
}
* PyLong_FromLong(100)
doesn't actually create a new object, but rather returns a new reference to an existing object. Integers with low values (0 <= i < 128
I think) are all cached and this same object is returned when needed. This is an implementation detail that is meant to avoid high levels of allocating and deallocating integers for small values, and so improve the performance of Python.