Python CFFI - Unable to use formatted Python string as byte-array in function call


I'm learning various ways to call C code from Python because I have an API for a Microchip device that is pretty... tedious to work with, and I would like to make my life easier by writing a Python wrapper for it so that I can test things much faster. One way of doing that is the cffi module, which even provides verify(), a function that basically calls the C compiler to check whether the provided cdef(...) is correct.

I've written a small project so that I can first learn how to use cffi properly. It consists of two parts:

  1. Library - written in C. I build it with cmake and make:

    CMakeLists.txt

    cmake_minimum_required(VERSION 2.8)
    project(testlib_for_cffi)
    
    set(CMAKE_BUILD_TYPE Release)
    # The sources are C, so set the C flags (the CXX flags are ignored for .c files)
    set(CMAKE_C_FLAGS "-fPIC ${CMAKE_C_FLAGS}")
    # Debug build
    set(CMAKE_C_FLAGS_DEBUG "${CMAKE_C_FLAGS_DEBUG} -Wall -g -O0")
    # Release build
    set(CMAKE_C_FLAGS_RELEASE "${CMAKE_C_FLAGS_RELEASE} -Os")
    
    aux_source_directory(. SRC_LIST)
    add_library(testcffi SHARED ${SRC_LIST})
    
    # Not required for the library but needed if I want to check for memory leaks with Valgrind
    set(SRC main.c)
    add_executable(${PROJECT_NAME} ${SRC})
    target_link_libraries(${PROJECT_NAME} PUBLIC testcffi)
    

    testcffi.h

    typedef struct
    {
      double x;
      double y;
      double z;
      char *label;
    } point_t;
    
    // Creation, printing and deletion
    point_t* createPoint(double x, double y, double z, char *label);
    void printPoint(point_t *point);
    void deletePoint(point_t *point);
    

    testcffi.c

    #include "testcffi.h"
    #include <stdio.h>
    #include <stdlib.h>
    
    point_t* createPoint(double x, double y, double z, char *label) {
      point_t *p = malloc(sizeof(point_t));
      p->x = x;
      p->y = y;
      p->z = z;
      p->label = label;
    
      return p;
    }
    
    void printPoint(point_t *point) {
      if(point == NULL) return;
      printf("Data:\n\tx : %f\n\ty : %f\n\tz : %f\n\tmsg : \"%s\"\n", point->x, point->y, point->z, point->label);
    }
    
    void deletePoint(point_t *point) {
      if(point == NULL) return;
      free(point);
      point = NULL;  // only clears the local copy; the caller's pointer is unchanged
    }
    
  2. Test code in Python - the code demonstrates the usage of the struct along with the three functions from the library above:

            #!/usr/bin/python3
    
            from cffi import FFI
            import random
    
            ffi = FFI()
    
            # Add library's header
            ffi.cdef('''
                typedef struct
                {
                  double x;
                  double y;
                  double z;
                  char * label;
                } point_t;
    
                // Creation, printing and deletion
                point_t * createPoint(double x, double y, double z, char *label);
                void printPoint(point_t *point);
                void deletePoint(point_t *point);
            ''')
    
            # Load shared object from subdirectory `build`
            CLibTC = ffi.dlopen('build/libtestcffi.so')
    
            def createList(length=5):
                if length:
                    lst = []
                    for i in range(0, length):
                        lst.append(CLibTC.createPoint(
                            float(random.random()*(i+1)*10),
                            float(random.random()*(i+1)*10),
                            float(random.random()*(i+1)*10),
                            b'hello'  # FIXME Why does ONLY this work?
                            # ('point_%d' % i).encode('utf-8') # NOT WORKING
                            # 'point_{0}'.format(str(i)).encode('utf-8') # NOT WORKING
                            # ffi.new('char[]', 'point_{0}'.format(str(i)).encode('utf-8')) # NOT WORKING
                        ))
    
                    return lst
                return None
    
    
            def printList(lst):
                if lst and len(lst):
                    for l in lst:
                        CLibTC.printPoint(l)
    
            list_of_dstruct_ptr = createList(10)
            printList(list_of_dstruct_ptr)
    

The problem is with the byte array that I have to convert my Python string to in order to pass the data to my C code.

The code above works; however, I would like to use strings other than literals like b'hello'. That is why I tried format() (and the older % operator) to combine some letters and a number, but it didn't work out. I either get "" as the value of the label field of my point_t struct, or I get varying garbage data (mostly weird characters that are neither letters nor digits).

I thought I was using encode() incorrectly, but when I tested it in the Python interactive shell I got the SAME output as with b'...'.
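A check of that kind can be sketched as follows. The index 3 and the variable names are only illustrative, not taken from the project above; the point is that both formatting styles yield a bytes object equal to the literal:

```python
# Illustrative index only; any value would do.
i = 3

literal = b'point_3'
via_percent = ('point_%d' % i).encode('utf-8')
via_format = 'point_{0}'.format(str(i)).encode('utf-8')

# Both comparisons are expected to be True: the bytes are identical.
print(via_percent == literal, via_format == literal)
print(type(via_percent))
```

So the contents of the byte string are not what differs between the working and the non-working variants.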

Any idea what's going on here?


A nice-to-know question: from what I've read so far, it seems that cffi uses Python's garbage collection to deallocate the memory that is dynamically allocated in the C code. I've tested it with a bunch of points, but I would like to make sure this is always the case.


Update: Okay, so it seems that the variants without new(...) do work; however, in that case all the labels are the same as the last one in the loop. For example, if the loop goes up to 10, then all the structs will have 10 in their labels. This seems to be a reference issue. When I use new(...), I get garbage data.


Solution

  • In your C code, the label field of the point_t structure holds a char *, i.e. a pointer to some other place in memory. If you create 10 point_t structures, they hold pointers to 10 strings that live somewhere else in memory. You have to make sure that these 10 strings are kept alive for as long as you use the point_t structures. CFFI cannot guess that there is such a relationship. When you call CLibTC.createPoint(..., some_string), CFFI allocates a temporary char[] array for the duration of the call and copies some_string into it, but this char[] memory is freed after the call, so the pointer stored in the structure is left dangling.

    Use this kind of code instead:

    c_string = ffi.new("char[]", some_string)
    lst.append(CLibTC.createPoint(..., c_string))
    keepalive.append(c_string)
    

    where keepalive is another list, which you must keep alive for as long as you need the point_t structures to contain valid labels.
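    Here is a minimal sketch of the keepalive pattern that runs without the shared library. As an assumption made only for this sketch, the structures are allocated with ffi.new("point_t *") instead of CLibTC.createPoint, and make_points is a hypothetical helper name; the lifetime rule for the label strings is the same in both cases.

```python
from cffi import FFI

ffi = FFI()
ffi.cdef("""
    typedef struct {
        double x;
        double y;
        double z;
        char *label;
    } point_t;
""")


def make_points(count):
    """Build `count` structs plus the list that keeps their labels alive."""
    points = []
    keepalive = []
    for i in range(count):
        # ffi.new() owns this char[] only as long as c_label is referenced.
        c_label = ffi.new("char[]", "point_{0}".format(i).encode("utf-8"))
        # Stand-in for CLibTC.createPoint(...): the struct stores only the pointer.
        p = ffi.new("point_t *")
        p.x = float(i)
        p.label = c_label
        points.append(p)
        keepalive.append(c_label)
    return points, keepalive


points, keepalive = make_points(3)
labels = [ffi.string(p.label).decode("utf-8") for p in points]
print(labels)
```

    The caller has to hold on to both returned lists. Once keepalive is dropped, the char[] buffers can be freed and every label pointer becomes invalid, which matches the empty or garbage labels described in the question.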