So lately, I am doing a project in C, where I implement a few type of trees, to be able to use them in python, since a C btree implementation is way faster than a Python one. (Ikr there are a few available libraries, but since I have more free time during the lockdown, I thought, I could do my own library.)
Everything is working fine, untill I would like to find an element, and print its value, all at the same line. When I do that, my Node
object turns into a Bytes
object, without any further assignment.
OS: Ubuntu 16.04.
Python version: Python 3.5.2
GCC: 5.4.0-6 Ubuntu
import mylib
import random
maxv = 10
def addValues(tree, values):
for value in values:
tree.insert(value)
def main():
my_list = [i for i in range(maxv)]
#random.shuffle(my_list)
tree = mylib.BinaryTree('test')
addValues(tree, my_list)
print(tree.find(3).getValue())
print(tree.sort(False))
main()
Expected Output (Works if line before the last in main func is print(tree.find(3))
):
3
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Output I got with the above test code:
3
Segmentation fault
The Segmentation fault occurs, because the node contains the value 3
became a Bytes
object while printing its value. The below code will print the new Node
.
import mylib
import random
maxv = 10
def addValues(tree, values):
for value in values:
tree.insert(value)
def main():
my_list = [i for i in range(maxv)]
#random.shuffle(my_list)
tree = mylib.BinaryTree('test')
addValues(tree, my_list)
print(tree.find(0).getValue()) #0 is going to be the root node's value, since random.shuffle is not happening.
print(tree.root)
print(tree.sort(False))
main()
Output:
0
b'0'
Segmentation fault
I spent days debugging this, and since I am definetly not a master in C programming, (You will see below), I could not find the error. I would like to move on, and implement more features, but I was unable to find the error. There is a chance I missed something so trivial, or a thing I am not aware of. I am not an experienced C programmer sadly. :/
My module contains much more code, I will not post what is not necessary for this problem. If you think you should see more to understand my code, feel free to tell me! Also I hope my code is well-readable!
Can someone explain what is happening exactly?
Thank you!
Recursive find function:
/** find node by value */
/*static*/ PyObject *find(Node *root, int value) {
if((PyObject *) root == Py_None) {
return NULL;
}
if(root->value == value) {
return (PyObject *) root;
}
if(value < root->value) {
return find((Node *) root->left, value);
}
if(value > root->value) {
return find((Node *) root->right, value);
}
}
Controller, this method is being called from Python, and this will call the above recursive function:
/** Find node by value */
/*static*/ PyObject *BinaryTree_Find(BinaryTree *self, PyObject *args, PyObject *kwds) {
int value;
PyObject *result;
if(!PyArg_ParseTuple(args, "i", &value)) {
return NULL;
}
result = find((Node *) self->root, value);
if(result == NULL) {
result = Py_None;
}
return (PyObject *) result;
}
You can use the Python code above to drive this module, and the below code to create the library.
from distutils.core import setup, Extension
setup(
name = 'mylib', version = '1.0', \
ext_modules = [Extension('mylib',
[
'custom.c',
])])
And run by: sudo python3 build.py install
#include <Python.h>
#include "structmember.h"
/* Header Block */
#define MODULE_NAME "mylib"
#define MODULE_DOC "Custom Binary Tree Implementations in C for Python"
/** Node */
typedef struct {
PyObject_HEAD
int value;
PyObject *left;
PyObject *right;
} Node;
/** BinaryTree */
typedef struct {
PyObject_HEAD
int elem;
PyObject *name;
PyObject *root;
} BinaryTree;
/** NODE */
/** creating node, set inital values */
/*static*/ PyObject *newNode(PyTypeObject *type, PyObject *args, PyObject *kwds) {
Node *self;
self = (Node *) type->tp_alloc(type, 0);
if(self == NULL) {
return NULL;
}
self->value = 0;
self->left = NULL;
self->right = NULL;
return (PyObject *) self;
}
/** init function of node, set parameter as value */
/*static*/ int Node_Init(Node *self, PyObject *args, PyObject *kwds) {
int value;
if(!PyArg_ParseTuple(args, "i", &value)) {
return -1;
}
self->value = value;
self->left = Py_None;
self->right = Py_None;
return 0;
}
/** deallocation of node */
/*static*/ void Node_Dealloc(Node *self) {
if(self->left != Py_None) Py_DECREF(self->left);
if(self->right != Py_None) Py_DECREF(self->right);
Py_TYPE(self)->tp_free((PyObject *) self);
}
/** return value of node */
/*static*/ PyObject *Node_GetValue(Node *self, PyObject *args, PyObject *kwds) {
return Py_BuildValue("i", self->value);
}
/** node variables */
/*static*/ PyMemberDef Node_Members[] = {
{
"left",
T_OBJECT_EX,
offsetof(Node, left),
0,
"Left Child"
},
{
"right",
T_OBJECT_EX,
offsetof(Node, right),
0,
"Right Child"
}
};
/** node methods */
/*static*/ PyMethodDef Node_Methods[] = {
{
"getValue",
(PyCFunction) Node_GetValue,
METH_NOARGS,
"Get value of node"
},
{NULL}
};
/** node type def */
/*static*/ PyTypeObject Node_Type = {
PyVarObject_HEAD_INIT(NULL, 0)
.tp_name = "BinaryTree Node",
.tp_doc = "Custom Binary Tree Node Object",
.tp_basicsize = sizeof(Node),
.tp_itemsize = 0,
.tp_flags = Py_TPFLAGS_DEFAULT,
.tp_new = newNode,
.tp_init = (initproc) Node_Init,
.tp_dealloc = (destructor) Node_Dealloc,
.tp_members = Node_Members,
.tp_methods = Node_Methods,
};
/** CONTROLLER */
/** insert new node into the tree */
/*static*/ PyObject *insert(Node *root, int value, PyObject *args) {
if((PyObject *) root == Py_None) {
return PyObject_CallObject((PyObject *) &Node_Type, args);
}
if(root->value == value) {
return NULL;
}
if(value < root->value) {
root->left = insert((Node *) root->left, value, args);
}
if(value > root->value) {
root->right = insert((Node *) root->right, value, args);
}
return (PyObject *) root;
}
/** find node by value */
/*static*/ PyObject *find(Node *root, int value) {
if((PyObject *) root == Py_None) {
return NULL;
}
if(root->value == value) {
return (PyObject *) root;
}
if(value < root->value) {
return find((Node *) root->left, value);
}
if(value > root->value) {
return find((Node *) root->right, value);
}
}
/** sort tree asc / inorder traversal */
/*static*/ void sortAsc(Node *root, PyObject *list) {
if((PyObject *) root == Py_None) {
return;
}
sortAsc((Node *) root->left, list);
PyList_Append(list, Py_BuildValue("i", root->value));
sortAsc((Node *) root->right, list);
}
/** sort tree desc */
/*static*/ void sortDesc(Node *root, PyObject *list) {
if((PyObject *) root == Py_None) {
return;
}
sortDesc((Node *) root->right, list);
PyList_Append(list, Py_BuildValue("i", root->value));
sortDesc((Node *) root->left, list);
}
/** BinaryTree */
/** creating binary tree, set inital values */
/*static*/ PyObject *newBinaryTree(PyTypeObject *type, PyObject *args, PyObject *kwds) {
BinaryTree *self;
self = (BinaryTree *) type->tp_alloc(type, 0);
if(self == NULL) {
return NULL;
}
self->name = PyUnicode_FromString("");
if(self->name == NULL) {
Py_DECREF(self);
return NULL;
}
self->elem = 0;
self->root = Py_None;
return (PyObject *) self;
}
/** set parameters as variables */
/*static*/ int BinaryTree_Init(BinaryTree *self, PyObject *args, PyObject *kwds) {
PyObject *name, *temp;
if(!PyArg_ParseTuple(args, "O", &name)) {
return -1;
}
if(name) {
temp = self->name;
Py_INCREF(name);
self->name = name;
Py_XDECREF(temp);
}
self->elem = 0;
return 0;
}
/** deallocation of binary tree */
/*static*/ void BinaryTree_Dealloc(BinaryTree *self) {
Py_XDECREF(self->name);
Py_XDECREF(self->root);
Py_TYPE(self)->tp_free((PyObject *) self);
}
/** insert controller, set parameter as value, drive inserting, return true */
/*static*/ PyObject *BinaryTree_Insert(BinaryTree *self, PyObject *args, PyObject *kwds) {
int value;
PyObject *result;
if(!PyArg_ParseTuple(args, "i", &value)) {
return NULL;
}
result = insert((Node *) self->root, value, args);
if(result == NULL) {
Py_XDECREF(result);
PyErr_SetString(PyExc_TypeError, "Already existing value!");
return NULL;
}
self->root = result;
self->elem++;
Py_RETURN_TRUE;
}
/** Find node by value */
/*static*/ PyObject *BinaryTree_Find(BinaryTree *self, PyObject *args, PyObject *kwds) {
int value;
PyObject *result;
if(!PyArg_ParseTuple(args, "i", &value)) {
return NULL;
}
result = find((Node *) self->root, value);
if(result == NULL) {
result = Py_None;
}
return (PyObject *) result;
}
/** sort binary tree asc */
/*static*/ PyObject *BinaryTree_Sort(BinaryTree *self, PyObject *args, PyObject *kwds) {
int rev;
PyObject *list = PyList_New(0);
if(!PyArg_ParseTuple(args, "p", &rev)) {
Py_XDECREF(list);
return NULL;
}
switch(rev) {
case 0:
sortAsc( (Node *) self->root, list);
break;
case 1:
sortDesc((Node *) self->root, list);
break;
default:
sortAsc( (Node *) self->root, list);
}
return list;
}
/** binary tree variables */
/*static*/ PyMemberDef BinaryTree_Members[] = {
{
"name",
T_OBJECT_EX,
offsetof(BinaryTree, name),
0,
"BinaryTree's unique name"
},
{
"root",
T_OBJECT_EX,
offsetof(BinaryTree, root),
0,
"BinaryTree's root"
},
{NULL}
};
/** binary tree methods */
/*static*/ PyMethodDef BinaryTree_Methods[] = {
{
"insert",
(PyCFunction) BinaryTree_Insert,
METH_VARARGS,
"Insert new node."
},
{
"find",
(PyCFunction) BinaryTree_Find,
METH_VARARGS,
"Find node by value."
},
{
"sort",
(PyCFunction) BinaryTree_Sort,
METH_VARARGS,
"Sort tree."
},
{NULL}
};
/** binary tree type def */
/*static*/ PyTypeObject BinaryTree_Type = {
PyVarObject_HEAD_INIT(NULL, 0)
.tp_name = "BinaryTree",
.tp_doc = "Custom Binary Tree Object",
.tp_basicsize = sizeof(BinaryTree),
.tp_itemsize = 0,
.tp_flags = Py_TPFLAGS_DEFAULT,
.tp_new = newBinaryTree,
.tp_init = (initproc) BinaryTree_Init,
.tp_dealloc = (destructor) BinaryTree_Dealloc,
.tp_members = BinaryTree_Members,
.tp_methods = BinaryTree_Methods,
};
/** MODULE */
/** Module def */
static struct PyModuleDef mylib_module = {
PyModuleDef_HEAD_INIT,
MODULE_NAME, /* name of module */
MODULE_DOC, /* module documentation, may be NULL */
-1, /* size of per-interpreter state of the module */
};
/** Module init */
PyMODINIT_FUNC PyInit_mylib(void) {
PyObject *mod;
if(PyType_Ready(&BinaryTree_Type) < 0) {
return NULL;
}
if(PyType_Ready(&Node_Type) < 0) {
return NULL;
}
mod = PyModule_Create(&mylib_module);
if(mod == NULL) {
return NULL;
}
Py_INCREF(&BinaryTree_Type);
if(PyModule_AddObject(mod, "BinaryTree", (PyObject *) &BinaryTree_Type) < 0) {
Py_DECREF(&BinaryTree_Type);
Py_DECREF(mod);
return NULL;
}
Py_INCREF(&Node_Type);
if(PyModule_AddObject(mod, "Node", (PyObject *) &Node_Type) < 0) {
Py_DECREF(&Node_Type);
Py_DECREF(mod);
return NULL;
}
return mod;
}
print(tree.root.getValue())
print(tree.find(tree.root.value).getValue())
Works fine! The error lies in the find function!
Thanks to DavidW, he pointed out it must be a reference error somewhere, so I counted, how many references are there refering to that Node object before, and after printing it's value. The results:
2 # references before
56 # value before, this line accessed the Node.
b'56\n' # value after
139690028005977 # references after
Segmentation fault # segfault while sorting
So while using Find
, the Node's reference count gets decreased, therefore all Node
objects are deallocated/deleted. Increasing before returning, and custom setter/getter methods with INCREF
/DECREF
-s are also ineffective.
Following DavidW's suggestion, I managed to find the answer! I did not increased the reference count, therefore the Node
s where deallocated right after searching. The objects were not behaving the same way when I did not requested the Node's value, or when I did not print the value out.I suggest the print function decreased the reference count somehow.
Right find controller:
/** Find nood by value */
/*static*/ PyObject *BinaryTree_Find(BinaryTree *self, PyObject *args, PyObject *kwds) {
int value;
PyObject *result;
if(!PyArg_ParseTuple(args, "i", &value)) {
return NULL;
}
result = find((Node *) self->root, value);
if(result == NULL) {
result = Py_None;
}
Py_INCREF(result); // Increasing the Node object's reference count here
return (PyObject *) result;
}
As discussed in the comments it was a reference counting error. A C API function must return what Python calls a "new reference". That means either returning something that has been purposefully created inside the function (for example the result of PyList_New
) in incrementing the reference count of an existing object.
In BinaryTree_Find
specifically you were not returning a new reference. As a result Python would end up freeing an object that still formed part of your BinaryTree. Once that's happened you can get all kinds of strange and confusing behaviour. I suggested adding Py_INCREF(result)
.
To help diagnose this kind of problem it was be worth adding printf
statements to object constructors and destructors (just as a temporary debugging measure) so you can check your assumptions about when they are allocated and freed.