I'm learning the basics of Cython 3.0 and I'm trying to understand why one of my approaches to dereferencing a pointer (to multiple memory addresses) in order to initialize the values in those addresses doesn't work while the other approaches do work.
Here is my test code:
import numpy as np
from libc.stdlib cimport malloc
np_arr = np.array([1.0, 2.0])
cdef double[2] c_arr1 = [1.0, 2.0] # this works
cdef double[2] c_arr2 = np_arr.tolist() # this works
cdef double *c_arr3 = <double *>malloc(2*sizeof(double))
c_arr3[:] = [1.0, 2.0] # this works
cdef double *c_arr4 = <double *>malloc(2*sizeof(double))
c_arr4[:] = np_arr.tolist() # this doesn't work; gives two compile errors
cdef double *c_arr5 = <double *>malloc(2*sizeof(double))
c_arr5[0] = np_arr.tolist()[0]
c_arr5[1] = np_arr.tolist()[1] # this works
The compile errors for c_arr4 are: "Cannot convert Python object to 'double *'" and "Storing unsafe C derivative of temporary Python reference".
It seems that using [:] on a pointer doesn't actually dereference all of the addresses, yet c_arr3 managed to initialize with it. So my questions are:
Why does c_arr3[:] = [1.0, 2.0] work? What is the [:] doing exactly?
If c_arr3[:] and c_arr4[:] are considered as double * objects to Cython, why is np_arr.tolist() considered a Python object but not [1.0, 2.0]?
Is there another (ideally more efficient) way of changing the values at the addresses of a pointer without manually looping through each address like I did for c_arr5?
cdef double *c_arr3 = <double *>malloc(2*sizeof(double))
c_arr3[:] = [1.0, 2.0]
Here, the Python list [1.0, 2.0] only serves to initialise the block of heap memory that your pointer c_arr3 points to, so there is no need to actually build a list object at runtime. That's one of the reasons why Cython will roughly generate the following C code instead:
/* allocate the memory */
double* c_arr3 = (double*) malloc(2*sizeof(double));
/* create a C array (instead of a list) to initialize c_arr3 */
double tmp[2];
tmp[0] = 1.0;
tmp[1] = 2.0;
/* copy all values of tmp into c_arr3 */
memcpy(&c_arr3[0], &tmp[0], 2*sizeof(double));
The a[:] = expr form is Cython's slice-assignment syntax. It is typically used with typed memoryviews and works much like slice assignment on NumPy arrays: it copies the values of the right-hand-side expr into a.
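c_arr3[:] and c_arr4[:] are not double* objects. The slice is just syntactic sugar in Cython for copying values (see the C code above), which also means it can be used to initialize blocks of memory. However, at the time of writing, initialization via c_arr3[:] = expr only works for simple right-hand-side literals such as Python lists. np_arr.tolist() is a method call on an untyped Python object, so its result is only known at runtime and Cython has to treat it as a generic Python object.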
Yes, you could use a typed memoryview instead of raw pointers:
# free is needed below in addition to malloc
from libc.stdlib cimport malloc, free
cdef double[::1] c_mv = <double[:2]>malloc(2*sizeof(double))
# set all values to 1.0
c_mv[:] = 1.0
# initialize it with the values of np_arr
# note that you don't need to convert np_arr to a list
c_mv[:] = np_arr[:]
# don't forget to free the memory!
free(&c_mv[0])
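If you would rather keep the raw pointer, the copy that c_arr3[:] = [1.0, 2.0] compiles to can also be written by hand with memcpy, as long as the source is a contiguous buffer of doubles. Below is a minimal C sketch of that idea. The helper name copy_doubles is hypothetical (it is not something Cython generates), and the sketch assumes the caller frees the returned block. In Cython, memcpy is available through from libc.string cimport memcpy.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate room for n doubles on the heap and copy
 * the n values starting at src into it in a single memcpy call.
 * Returns NULL if src is NULL, n is 0, or the allocation fails.
 * The caller owns the returned block and must free() it. */
double *copy_doubles(const double *src, size_t n)
{
    if (src == NULL || n == 0) {
        return NULL;
    }

    double *dst = malloc(n * sizeof *dst);
    if (dst == NULL) {
        return NULL;
    }

    /* One bulk copy instead of assigning dst[0], dst[1], ... by hand. */
    memcpy(dst, src, n * sizeof *dst);
    return dst;
}
```

From Cython, the source pointer could come from a contiguous typed memoryview over the NumPy array (for example &view[0] for a double[::1] view), which avoids the detour through tolist() entirely.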