Tags: python, compilation, runtime, pytorch, theano

Which Python deep learning libraries compile at runtime?


I am trying to wrap my head around C-optimized code in Python. I have read several times that Python achieves high-speed computing through C extensions. In other words, whenever I work with a library such as numpy, it calls a C extension that calculates the result and returns it.

C-extensions using numpy

Say I want to add two numbers using np.add(x, y). If I understand correctly, libraries such as numpy do not compile the Python code; instead, they ship with precompiled binaries that simply take the values x and y and return the result. Is that correct?

Theano, TensorFlow, and PyTorch

In particular, I am wondering whether this is also true for deep learning libraries. According to the official Theano documentation, it requires g++ and gcc (or at least they are highly recommended). Does this mean that Theano compiles C (or C++) code while the Python script is running? If so, is it the same for PyTorch and TensorFlow?

I hope that someone can clear up my confusion here! Thanks a lot!


Solution

    C extensions in Python

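    numpy relies heavily on C extensions. The compiled code ships with the package, so a call like np.add(x, y) runs prebuilt machine code and nothing is compiled when you call it. For instance, you can take a look at the C implementation [2] of the sort() function [1].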

    [1] https://docs.scipy.org/doc/numpy/reference/generated/numpy.sort.html

    [2] https://github.com/numpy/numpy/blob/master/numpy/core/src/npysort/quicksort.c.src
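One way to check this for np.add yourself, as a minimal sketch that assumes a standard numpy install: a function written in Python carries bytecode in a `__code__` attribute, whereas np.add is a `ufunc` object implemented in C and has none.

```python
import inspect

import numpy as np

# np.add is a "ufunc" object implemented in C, not a Python-level function.
is_ufunc = isinstance(np.add, np.ufunc)
is_python_function = inspect.isfunction(np.add)

# Functions written in Python expose their bytecode via __code__; np.add does not.
has_bytecode = hasattr(np.add, "__code__")

# Calling it dispatches straight into the precompiled C loop.
result = np.add(2, 3)

print(type(np.add))
print(is_ufunc, is_python_function, has_bytecode)
print(result)
```

Nothing here invokes a compiler: the C code was built before the package reached your machine (or at install time, if you built numpy from source).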

    Deep learning libraries

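    Deep learning libraries implement a large part of their backend as C/C++ extensions, and they use CUDA and cuDNN for GPU execution. Some code can also be compiled at runtime: Theano compiles its computation graph [3], TensorFlow offers XLA just-in-time compilation [4], and PyTorch describes a JIT compiler in its 1.0 roadmap [5].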

    [3] http://deeplearning.net/software/theano/extending/pipeline.html#compilation-of-the-computation-graph

    [4] https://www.tensorflow.org/xla/jit

    [5] https://pytorch.org/blog/the-road-to-1_0/#production--pain-for-researchers

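    To answer your question: yes, Theano compiles C/C++ code while the Python script is running. PyTorch and TensorFlow differ in that their core operations ship precompiled, and the runtime compilation in [4] and [5] is opt-in. Graph compilation in Theano is also very slow, so I advise you to focus on PyTorch or TensorFlow rather than Theano.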
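To make "compiling C at runtime" concrete, here is a toy sketch using only the standard library. It is not Theano's actual code path; the names `C_SOURCE`, `compile_and_load_add`, and `toy_add` are made up for illustration. It assumes a Unix-like system with `gcc` or `cc` on the PATH, and it returns None when no compiler is found, which is also why a library that works this way needs gcc/g++ installed.

```python
import ctypes
import os
import shutil
import subprocess
import tempfile

# Hypothetical C source, written out and compiled while the script runs.
C_SOURCE = """
double add(double x, double y) { return x + y; }
"""


def compile_and_load_add():
    """Compile C_SOURCE into a shared library and return its `add` function.

    Returns None when no C compiler is found on PATH.
    """
    compiler = shutil.which("gcc") or shutil.which("cc")
    if compiler is None:
        return None

    build_dir = tempfile.mkdtemp()
    src_path = os.path.join(build_dir, "toy_add.c")
    lib_path = os.path.join(build_dir, "toy_add.so")
    with open(src_path, "w") as f:
        f.write(C_SOURCE)

    # This is the "runtime compilation" step: the compiler runs as a subprocess.
    subprocess.run(
        [compiler, "-shared", "-fPIC", "-O2", src_path, "-o", lib_path],
        check=True,
    )

    # Load the freshly built library and declare the C signature for ctypes.
    lib = ctypes.CDLL(lib_path)
    lib.add.argtypes = [ctypes.c_double, ctypes.c_double]
    lib.add.restype = ctypes.c_double
    return lib.add


add = compile_and_load_add()
if add is None:
    print("No C compiler found; skipping the runtime compilation demo.")
else:
    print(add(2.0, 3.0))
```

The subprocess call is where the time goes, and a real graph compiler generates far more code than this one-line function, so the compile step can come to dominate startup.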

    If you're new to deep learning, you may also want to take a quick look at JAX [6].

    [6] https://github.com/google/jax