I am trying to convert a Python list of lists to a Cython multidimensional array. The list has 300,000 elements, and each element is a list of 10 integers, generated randomly for this example. My approach works fine as long as the Cython multidimensional array is no bigger than about [210000][10]. My actual project is of course more complex, but I believe that if I get this example to work, the rest is just more of the same.
I have a cython file "array_cy.pyx" with the following content:
cpdef doublearray(list list1):
    cdef int[200000][10] a
    cdef int i
    cdef int y
    cdef int j
    cdef int value = 0
    for i in range(200000):
        for y in range(10):
            a[i][y] = list1[i][y]
    print("doublearray")
    print(a[40000][6])

cpdef doublearray1(list list1):
    cdef int[300000][10] a
    cdef int i
    cdef int y
    cdef int value = 0
    for i in range(300000):
        for y in range(10):
            a[i][y] = list1[i][y]
    print("doublearray1")
    print(a[40000][6])
Then in main.py I have:
import array_cy
import random

list1 = []
for i in range(300000):
    list2 = []
    for j in range(10):
        list2.append(random.randint(0, 22))
    list1.append(list2)

array_cy.doublearray(list1)
array_cy.doublearray1(list1)
And the output is:
doublearray
4
Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)
So the function doublearray(list) works fine and the output is some random number, as expected. But doublearray1(list) gives SIGSEGV. If in doublearray1(list) I comment out the line
print(a[40000][6])
it also runs through without a problem, which makes sense because I never try to access the array. I don't understand why it does not work. I thought that in C the limit on the number of elements in an array would be defined by the hardware. My goal is to convert the Python list of lists to a Cython multidimensional array that I can access without any Python interaction.
The suggested question is about using malloc. I think that is what I need, but I still can't get it to work, because if I change the two functions to:
cpdef doublearray(list list1):
    cdef int[200000][10] a = <int**> malloc(200000 * 10 * sizeof(int))
    cdef int i
    cdef int y
    cdef int j
    cdef int value = 0
    for i in range(200000):
        for y in range(10):
            a[i][y] = list1[i][y]
    print("doublearray")
    print(a[40000][6])

cpdef doublearray1(list list1):
    cdef int[300000][10] a = <int**> malloc(300000 * 10 * sizeof(int))
    cdef int i
    cdef int y
    cdef int value = 0
    for i in range(300000):
        for y in range(10):
            a[i][y] = list1[i][y]
    print("doublearray1")
    print(a[40000][6])
Still, only the smaller array works.
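A rough size check may help explain where the limit comes from. An array declared as a local cdef variable is an ordinary C local, so it is placed on the stack. The sketch below assumes 4-byte C ints and a default stack limit of 8 MiB (a common Linux default; `ulimit -s` shows the actual value). Neither number is stated in the question, and the names array_bytes and ASSUMED_STACK_LIMIT are made up for illustration.

```python
import ctypes

ROWS_SMALL, ROWS_LARGE, COLS = 200_000, 300_000, 10
INT_SIZE = ctypes.sizeof(ctypes.c_int)  # size of a C int on this platform

# Assumption: a typical default stack limit of 8 MiB (check with `ulimit -s`)
ASSUMED_STACK_LIMIT = 8 * 1024 * 1024


def array_bytes(rows, cols=COLS, item_size=INT_SIZE):
    """Bytes needed for a rows x cols array of C ints."""
    return rows * cols * item_size


for rows in (ROWS_SMALL, ROWS_LARGE):
    size = array_bytes(rows)
    fits = size < ASSUMED_STACK_LIMIT
    print(f"int[{rows}][{COLS}] needs {size} bytes; fits in assumed stack: {fits}")
```

Under these assumptions the 200000-row array stays just under the limit and the 300000-row array is well over it, which would be consistent with the crash appearing at around [210000][10].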
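A fixed-size local array such as int[300000][10] lives on the stack, which is much smaller than the heap, so an array this large should be allocated dynamically. The way to do that in C is to flatten the list of lists of length 10 into a 1D array, use malloc to allocate enough space, and free it afterwards. Another way is to use an array of pointers.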
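from libc.stdlib cimport malloc, free

cpdef doublearray1(list list1):
    # Heap allocation: 300000 rows * 10 columns stored as one flat block
    cdef int *a = <int *> malloc(300000 * 10 * sizeof(int))
    cdef int i
    cdef int y
    if a == NULL:
        raise MemoryError()
    try:
        for i in range(300000):
            for y in range(10):
                # element (i, y) lives at flat index i*10 + y
                a[i*10 + y] = list1[i][y]
        print("doublearray1")
        # same as a[2][5] in a 2D array
        print(a[2*10 + 5])
    finally:
        # always release the block, even if the copy raises
        free(a)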
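The index arithmetic can be checked in plain Python before compiling anything. This is only a sketch of the same row-major mapping, using the standard library's array module as the flat buffer instead of malloc; the helper names flatten and at are hypothetical, and the row count is smaller than the 300,000 in the question just to keep the demo quick.

```python
from array import array
import random

ROWS, COLS = 50_000, 10  # fewer rows than the question, for a quick demo


def flatten(rows_of_ints, cols=COLS):
    """Copy a list of equal-length lists into one flat array of C ints."""
    flat = array("i")
    for row in rows_of_ints:
        if len(row) != cols:
            raise ValueError("every row must have exactly `cols` items")
        flat.extend(row)
    return flat


def at(flat, i, y, cols=COLS):
    """Read element (i, y) from the flat row-major buffer."""
    return flat[i * cols + y]


list1 = [[random.randint(0, 22) for _ in range(COLS)] for _ in range(ROWS)]
flat = flatten(list1)

# Element (40000, 6) of the nested list and of the flat buffer should agree
print(at(flat, 40000, 6) == list1[40000][6])
```

Since array.array exposes the buffer protocol, a flat buffer like this could probably also be handed to Cython as a typed memoryview (for example `cdef int[:] view = flat`) as an alternative to manual malloc and free; that is a suggestion on top of the answer above, not part of it.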