python, c, python-c-api

SegFault when trying to write to a Numpy array created within a C Extension


I have an if clause inside a for loop. Before the loop, I define state_out with:

state_out = (PyArrayObject *) PyArray_FromDims(1,dims_new,NPY_BOOL);

The if/else branches look like this:

        if (conn_ctr<sum*2){
            *(state_out->data + i*state_out->strides[0]) =  true;
        }
        else {
            *(state_out->data + i*state_out->strides[0]) =  false;
        }

When I comment these assignments out, state_out is returned as an all-False NumPy array. There is a problem with this assignment that I fail to see. As far as I know, all the members of the PyArrayObject struct that I use here are pointers, so after the pointer arithmetic the result should point to the address I intend to write to. (All the if conditions in the code read values this way, and I know it works because I managed to printf the input arrays' values.) So if I want to assign a bool to one of these locations in memory, I should be able to do it with *(pointer_intended) = true. What am I missing?

EDIT: I have found that even if I don't write to those values and only put some printf calls inside the branches:

if (conn_ctr<sum*2){
    printf("True!\n");
}
else {
    printf("False!\n");
}

I get a SegFault again.

Thanks a lot. The rest of the code is here.

#include <Python.h>
#include "numpy/arrayobject.h"
#include <stdio.h>
#include <stdbool.h>

static PyObject* trace(PyObject *self, PyObject *args);

static char doc[] =
"This is the C extension for xor_masking routine. It interfaces with Python via C-Api, and calculates the "
"next state with C pointer arithmetic";

static PyMethodDef TraceMethods[] = {
    {"trace", trace, METH_VARARGS, doc},
    {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC
inittrace(void)
{
    (void) Py_InitModule("trace", TraceMethods);
    import_array();
}

static PyObject* trace(PyObject *self, PyObject *args){
    PyObject *adjacency ,*mask, *state;
    PyArrayObject *adjacency_arr, *mask_arr, *state_arr, *state_out;

    if (!PyArg_ParseTuple(args,"OOO:trace", &adjacency, &mask, &state)) return NULL;

    adjacency_arr = (PyArrayObject *)
        PyArray_ContiguousFromObject(adjacency, NPY_BOOL,2,2);

    if (adjacency_arr == NULL) return NULL;
    mask_arr = (PyArrayObject *)
        PyArray_ContiguousFromObject(mask, NPY_BOOL,2,2);

    if (mask_arr == NULL) return NULL;
    state_arr = (PyArrayObject *)
        PyArray_ContiguousFromObject(state, NPY_BOOL,1,1);

    if (state_arr == NULL) return NULL;

    int dims[2], dims_new[1];
    dims[0] = adjacency_arr -> dimensions[0];
    dims[1] = adjacency_arr -> dimensions[1];
    dims_new[0] =  adjacency_arr -> dimensions[0];
    if (!(dims[0]==dims[1] && mask_arr -> dimensions[0] == dims[0]
                         && mask_arr -> dimensions[1] == dims[0]
                         && state_arr -> dimensions[0] == dims[0]))
                         return NULL;


    state_out = (PyArrayObject *) PyArray_FromDims(1,dims_new,NPY_BOOL);

    int i,j;

    for(i=0;i<dims[0];i++){
        int sum = 0;
        int conn_ctr = 0;

            for(j=0;j<dims[1];j++){

                bool adj_value = (adjacency_arr->data + i*adjacency_arr->strides[0]
                         +j*adjacency_arr->strides[1]);

                if (*(bool *) adj_value == true){

                    bool mask_value = (mask_arr->data + i*mask_arr->strides[0]
                    +j*mask_arr->strides[1]);
                    bool state_value = (state_arr->data + j*state_arr->strides[0]);

                    if ( (*(bool *) mask_value ^ *(bool *)state_value) ==  true){
                        sum++;
                    }
                    conn_ctr++;
                }
            }

            if (conn_ctr<sum*2){

            }
            else {

            }
    }

    Py_DECREF(adjacency_arr);
    Py_DECREF(mask_arr);
    Py_DECREF(state_arr);
    return PyArray_Return(state_out);
}

Solution

  •     if (conn_ctr<sum*2){
            *(state_out->data + i*state_out->strides[0]) =  true;
        }
        else {
            *(state_out->data + i*state_out->strides[0]) =  false;
        }
    

    Here I naively do pointer arithmetic. state_out->data is a pointer to the beginning of the data, and it is declared as a pointer to char (see SciPy Doc - Python Types and C-Structures):

    typedef struct PyArrayObject {
        PyObject_HEAD
        char *data;
        int nd;
        npy_intp *dimensions;
        npy_intp *strides;
        ...
    } PyArrayObject;
    

    This is a portion of the definition, copied here. state_out->strides points to an array with one entry per dimension of the array; this is a 1-D array, so there is a single entry. With the pointer arithmetic (state_out->data + i*state_out->strides[0]) I do aim to compute a pointer to the ith element of the array, but I failed to give that pointer a type.
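The address computation described here can be sketched in plain C without NumPy. This is only an illustration under stated assumptions: FakeArray and fake_bool are hypothetical stand-ins for the relevant PyArrayObject fields and for npy_bool, and element_1d and set_1d are made-up helper names.

```c
#include <stddef.h>

/* Hypothetical stand-in for the fields of PyArrayObject used here;
   it is not the real NumPy struct. */
typedef struct {
    char *data;          /* start of the buffer, as raw bytes */
    ptrdiff_t *strides;  /* bytes to step per index, one entry per dimension */
} FakeArray;

/* Stand-in for npy_bool (NumPy defines it as unsigned char). */
typedef unsigned char fake_bool;

/* Address of element i of a 1-D array: byte arithmetic on char *,
   then a cast to the element type before dereferencing. */
static fake_bool *element_1d(FakeArray *arr, ptrdiff_t i)
{
    return (fake_bool *)(arr->data + i * arr->strides[0]);
}

/* Write a value through the typed pointer. */
static void set_1d(FakeArray *arr, ptrdiff_t i, fake_bool value)
{
    *element_1d(arr, i) = value;
}
```

The stride is measured in bytes, which is why the arithmetic is done on the char pointer and the cast to the element type comes last.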

    I had tried:

    NPY_BOOL *adj_value_ptr, *mask_value_ptr, *state_value_ptr, *state_out_ptr;
    

    where the variables point to the values I am interested in inside my for loop, and state_out_ptr is the one I write through. I had thought that, since I declare the elements of these arrays to be of type NPY_BOOL, the pointers to the data within the array would also be of type NPY_BOOL. This fails with a SegFault when one manipulates the memory directly. The reason is that NPY_BOOL is an enum value, an integer that NumPy uses internally as a type number (as pv kindly stated in the comments). For boolean values in C code there is the typedef npy_bool (see the SciPy docs). When I declared my pointers with the type

    npy_bool *adj_value_ptr, *mask_value_ptr, *state_value_ptr, *state_out_ptr;
    

    the segmentation fault disappeared, and I succeeded in manipulating and returning a NumPy array.
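The distinction between a type number and an element type can be shown in isolation. The sketch below does not use NumPy: FAKE_BOOL and fake_bool are hypothetical stand-ins for NPY_BOOL (an enum constant) and npy_bool (a typedef), and both function names are made up for illustration.

```c
#include <stddef.h>

/* Stand-ins, not the real NumPy names: an enum constant used as a
   type number, and a typedef used as the element type. */
enum fake_types { FAKE_BOOL = 0, FAKE_BYTE = 1 };
typedef unsigned char fake_bool;

/* A type number is an ordinary integer value that can be passed around... */
static int type_number_for_bool(void)
{
    return FAKE_BOOL;
}

/* ...while only the typedef can be used to declare pointers and variables. */
static size_t count_true(const fake_bool *values, size_t n)
{
    size_t count = 0;
    for (size_t k = 0; k < n; k++) {
        if (values[k]) {
            count++;
        }
    }
    return count;
}
```

In the same way, NPY_BOOL is what gets passed to functions that create or convert arrays, while npy_bool is what appears in C declarations.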

    I'm not an expert, but this solved my issue; point it out if I'm wrong.

    The part that has changed in the source code is:

    state_out = (PyArrayObject *) PyArray_FromDims(1,dims_new,NPY_BOOL);
    
    /* npy_bool is the C element type; NPY_BOOL is only the type number. */
    npy_bool *adj_value_ptr, *mask_value_ptr, *state_value_ptr, *state_out_ptr;
    npy_intp i,j;
    
    for(i=0;i<dims[0];i++){
        npy_int sum = 0;
        npy_int conn_ctr = 0;
    
            for(j=0;j<dims[1];j++){
    
                /* data is char *, so cast the computed address to the element type. */
                adj_value_ptr = (npy_bool *) (adjacency_arr->data + i*adjacency_arr->strides[0]
                         +j*adjacency_arr->strides[1]);
    
                if (*adj_value_ptr == true){
    
                    mask_value_ptr = (npy_bool *) (mask_arr->data + i*mask_arr->strides[0]
                    +j*mask_arr->strides[1]);
    
                    state_value_ptr = (npy_bool *) (state_arr->data + j*state_arr->strides[0]);
    
                    /* The pointers already have the element type; no further cast is needed. */
                    if ( (*mask_value_ptr ^ *state_value_ptr) ==  true){
                        sum++;
                    }
                    conn_ctr++;
                }
            }
            /* Address of the ith output element. */
            state_out_ptr = (npy_bool *) (state_out->data + i*state_out->strides[0]);
            if (conn_ctr < sum*2){
                *state_out_ptr =  true;
            }
            else {
                *state_out_ptr =  false;
            }
    }
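
One further difference between the two versions may be worth noting, as a hedged observation rather than a confirmed diagnosis. The code in the question stores each computed address in a plain bool (for example bool adj_value = (adjacency_arr->data + ...)) and later casts that bool back to a pointer, whereas the version above keeps the address in a pointer variable. In C, converting a non-null pointer to bool yields 1, so the address itself is lost. The sketch below shows that conversion in isolation; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Converting a pointer to bool keeps only "null or not", not the address. */
static bool as_bool(const char *p)
{
    bool b = p;   /* implicit pointer-to-bool conversion: 1 for any non-null p */
    return b;
}

/* Keeping the address in a pointer variable preserves it. */
static char read_through(const char *p)
{
    const char *q = p;
    return *q;
}
```

Casting such a bool back with (bool *) produces a pointer to address 1 rather than to the array element, and dereferencing it is invalid, which typically shows up as a segmentation fault.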