
How do I create a multidimensional Cython array of fixed size?


I am trying to convert a Python list of lists to a Cython multidimensional array. The list has 300,000 elements, and each element is a list of 10 integers (generated randomly for this example). My approach works fine as long as the Cython array is no bigger than roughly [210000][10]. My actual project is of course more complex, but I believe that if I get this example to work, the rest is just more of the same.

I have a Cython file "array_cy.pyx" with the following content:

cpdef doublearray(list list1):
    cdef int[200000][10] a
    cdef int i
    cdef int y
    cdef int j
    cdef int value = 0
    for i in range(200000):
        for y in range(10):
            a[i][y] = list1[i][y]
    print("doublearray")
    print(a[40000][6])

cpdef doublearray1(list list1):
    cdef int[300000][10] a
    cdef int i
    cdef int y
    cdef int value = 0
    for i in range(300000):
        for y in range(10):
            a[i][y] = list1[i][y]
    print("doublearray1")
    print(a[40000][6])

Then in main.py I have:

    import array_cy
    import random


    list1 = []
    for i in range(300000):
        list2 = []
        for j in range(10):
            list2.append(random.randint(0, 22))
        list1.append(list2)
    array_cy.doublearray(list1)
    array_cy.doublearray1(list1)

And the output is:

doublearray
4

Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)

So the function doublearray(list) works fine and the output is some random number, as expected. But doublearray1(list) fails with SIGSEGV. If I comment out the following line in doublearray1(list):

    print(a[40000][6])

it also runs through without a problem, which makes sense because I never try to access the array. I don't understand why it does not work. I thought that in C the limit on the number of elements in an array was defined by the hardware. My goal is to convert the Python list of lists into a Cython multidimensional array that I can access without any Python interaction.

The suggested question is about using malloc. I think that is what I need, but I still can't get it to work, because if I change the two functions to:

cpdef doublearray(list list1):
    cdef int[200000][10] a = <int**> malloc(200000 * 10 * sizeof(int))
    cdef int i
    cdef int y
    cdef int j
    cdef int value = 0
    for i in range(200000):
        for y in range(10):
            a[i][y] = list1[i][y]
    print("doublearray")
    print(a[40000][6])

cpdef doublearray1(list list1):
    cdef int[300000][10] a = <int**> malloc(300000 * 10 * sizeof(int))
    cdef int i
    cdef int y
    cdef int value = 0
    for i in range(300000):
        for y in range(10):
            a[i][y] = list1[i][y]
    print("doublearray1")
    print(a[40000][6])


still only the smaller array works.


Solution

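  • A local array declared as cdef int[300000][10] lives on the stack. Assuming a 4-byte int, that is about 12 MB, which is likely more than the stack limit (commonly around 8 MiB on Linux), hence the SIGSEGV. The way to do this in C is to flatten the list of lists (each of length 10) into a 1D array, allocate enough space for it on the heap with malloc, and free it afterwards. Another way is to use an array of pointers.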

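    from libc.stdlib cimport malloc, free

    cpdef doublearray1(list list1):
        # Allocate the 300000 x 10 ints on the heap as one flat block.
        cdef int *a = <int *> malloc(300000 * 10 * sizeof(int))
        cdef int i
        cdef int y
        if a == NULL:
            raise MemoryError()
        try:
            for i in range(300000):
                for y in range(10):
                    # Row-major layout: element [i][y] lives at i*10 + y.
                    a[i*10+y] = list1[i][y]
            print("doublearray1")
            # same as a[2][5] in a 2D array
            print(a[25])
        finally:
            # Release the block even if the conversion raises.
            free(a)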
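
The index arithmetic behind the flat layout can be checked in plain Python before moving it into Cython. The following is only a sketch: the helper names flatten and at are made up for illustration, the input is filled with deterministic values instead of random ones, and the 8 MiB stack limit is an assumption about a typical Linux default, not something read from the system.

```python
from array import array

ROWS, COLS = 300_000, 10
# Assumption: a typical default stack limit on Linux; the real value may differ.
ASSUMED_STACK_LIMIT = 8 * 1024 * 1024


def flatten(rows, cols):
    """Copy a list of equal-length rows into one contiguous buffer of C ints."""
    flat = array("i", [0]) * (len(rows) * cols)
    for i, row in enumerate(rows):
        for y in range(cols):
            # Row-major layout: element [i][y] is stored at index i*cols + y.
            flat[i * cols + y] = row[y]
    return flat


def at(flat, cols, i, y):
    """Read element [i][y] from the flat buffer (hypothetical helper)."""
    return flat[i * cols + y]


# Deterministic stand-in for the random list of lists from main.py.
list1 = [[(i * COLS + y) % 23 for y in range(COLS)] for i in range(ROWS)]
flat = flatten(list1, COLS)

size_bytes = len(flat) * flat.itemsize
print("bytes needed:", size_bytes)
print("exceeds assumed stack limit:", size_bytes > ASSUMED_STACK_LIMIT)
print(at(flat, COLS, 2, 5) == flat[25])              # True
print(at(flat, COLS, 40000, 6) == list1[40000][6])   # True
```

The same i*cols + y mapping is what the malloc version uses; only the storage (a heap block instead of a Python array object) differs.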