python cython ctypes python-extensions

Differences between Cython, extending Python with C/C++ via Python.h, etc.


Right now I have an image processing algorithm that is roughly 100 lines or so in Python. It takes about 500ms using numpy, PIL and scipy. I am looking to get it faster, and as the actual algorithm seems pretty optimized thus far, I'm wondering if using a different approach such as Cython would improve the times. I believe that I have several different things I could do:

  1. Use Cython to expose relevant parts of the C library to Python.
  2. Use ctypes to write everything in C while keeping the Python side pure Python (not leaning towards this at all).
  3. Create an extension module in C/C++, then import it and call its functions. I'm not sure if I would be able to use numpy this way though.
  4. Create a DLL and have Python load it. This doesn't get to use numpy or those modules, but would still be very efficient.

I'm just looking for speed here, not worried about difficulty of the implementation. Is there any one option that is better in this case, are they all the same, or is it even worth doing?


Solution

  • It helps to know what you need to do here.

    If you're not using ctypes for function calls, it's unlikely that it will save you anything to just have ctypes types involved. If you already have some DLL/shared object lying around with a "solve it for me" function, then sure, ctypes it is.
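
    As a minimal sketch of the "existing shared library" case, the snippet below calls sqrt from the system C math library through ctypes. It assumes a POSIX-like system where find_library("m") either locates libm or returns None (in which case CDLL falls back to symbols already loaded in the process); on Windows you would load a different library. The wrapper name c_sqrt is made up for illustration.

```python
import ctypes
import ctypes.util

# Locate the C math library. If find_library returns None on a POSIX system,
# CDLL(None) exposes the symbols already loaded into the current process.
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Declare the C signature: double sqrt(double).
# Without this, ctypes would assume int arguments and an int return value.
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double


def c_sqrt(x):
    """Hypothetical wrapper: call the C sqrt function through ctypes."""
    return libm.sqrt(x)


print(c_sqrt(9.0))
```

    Each call through ctypes pays a conversion cost for its arguments and return value, so this pays off when one call does a lot of work in C, not when a tiny function is called in a tight Python loop.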

    Cython generates extension modules, so anything you can do with Cython can also be done in a hand-written extension; it just depends on how comfortable you are writing extensions by hand. Cython is more limited than a hand-written extension, and its performance is harder to "see": the rules for optimizing Cython are basically the opposite of those for optimizing ordinary Python-level CPython code. If you forget to cdef the right things, you gain nothing, and cdefing the wrong things can make the code slower. Still, Cython is generally simpler to write.
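
    To illustrate the Python-level half of that contrast, here is a rough sketch using only the standard library. In plain CPython the usual advice is to avoid explicit loops and hand the work to built-ins that loop in C; in Cython, an explicit loop over cdef-typed variables is typically the fast path. The function names and data size are arbitrary choices for the example, and the timings will vary by machine.

```python
import timeit


def total_with_loop(values):
    # Explicit Python-level loop: every iteration goes through the interpreter.
    total = 0
    for v in values:
        total += v
    return total


def total_with_builtin(values):
    # The built-in sum() performs its loop in C inside CPython.
    return sum(values)


data = list(range(100_000))

# Both approaches must agree before their timings are worth comparing.
assert total_with_loop(data) == total_with_builtin(data)

loop_time = timeit.timeit(lambda: total_with_loop(data), number=20)
builtin_time = timeit.timeit(lambda: total_with_builtin(data), number=20)
print(f"explicit loop: {loop_time:.4f}s, built-in sum: {builtin_time:.4f}s")
```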

    Writing a separate non-extension DLL is only worthwhile if you have non-Python uses for it; otherwise, a Python extension is basically just the DLL case, but better integrated.

    Basically, by definition, with infinite time and skill, a hand-written CPython extension will match or beat any other option on performance, since it can do everything the others do, and more. It's just more work, and it is easy to make mistakes (because you're writing C, which is error-prone).