c++ python distutils

Speeding up build process with distutils


I am programming a C++ extension for Python and I am using distutils to compile the project. As the project grows, rebuilding it takes longer and longer. Is there a way to speed up the build process?

I read that parallel builds (as with make -j) are not possible with distutils. Are there any good alternatives to distutils which might be faster?

I also noticed that it's recompiling all object files every time I call python setup.py build, even when I only changed one source file. Should this be the case or might I be doing something wrong here?

In case it helps, here are some of the files that I am trying to compile: https://gist.github.com/2923577

Thanks!


Solution

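    1. Try building with the environment variable CC="ccache gcc". This speeds up the build significantly when the sources have not changed, because ccache returns the cached compilation result instead of running the compiler again. (Strangely, distutils also uses CC for C++ source files.) You need to have the ccache package installed, of course.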
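
If you would rather not export CC by hand each time, one option is to set it from setup.py itself. This is only a minimal sketch: `ccache_env` is a hypothetical helper (not part of distutils), it assumes gcc as the underlying compiler, and it assumes the code runs before setup() is called so that the build sees the modified environment.

```python
import os
import shutil


def ccache_env(environ, compiler="gcc"):
    """Return a copy of `environ` with CC routed through ccache when possible.

    `ccache_env` is a hypothetical helper for this sketch, not a distutils API.
    """
    env = dict(environ)
    # Respect an explicit CC set by the user, and do nothing if ccache
    # is not installed (shutil.which returns None in that case).
    if "CC" not in env and shutil.which("ccache"):
        env["CC"] = "ccache " + compiler
    return env


# In setup.py this would go before the call to setup().
os.environ.update(ccache_env(os.environ))
```

An explicitly exported CC still wins, so you can override the compiler from the shell as before.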

    2. Since you have a single extension that is assembled from multiple compiled object files, you can monkey-patch distutils to compile them in parallel (they are independent of one another). Put this into your setup.py and adjust N=2 as you wish:

      # monkey-patch for parallel compilation
      import multiprocessing.pool
      import distutils.ccompiler

      N = 2  # number of parallel compilations

      def parallelCCompile(self, sources, output_dir=None, macros=None, include_dirs=None, debug=0, extra_preargs=None, extra_postargs=None, depends=None):
          # these lines are copied directly from distutils.ccompiler.CCompiler.compile
          macros, objects, extra_postargs, pp_opts, build = self._setup_compile(output_dir, macros, include_dirs, sources, depends, extra_postargs)
          cc_args = self._get_cc_args(pp_opts, debug, extra_preargs)

          def _single_compile(obj):
              try:
                  src, ext = build[obj]
              except KeyError:
                  # no build entry for this object; skip it
                  return
              self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)

          # imap is evaluated on demand, so list() forces every job to run
          pool = multiprocessing.pool.ThreadPool(N)
          try:
              list(pool.imap(_single_compile, objects))
          finally:
              # release the worker threads even if a compilation fails
              pool.close()
              pool.join()
          return objects

      distutils.ccompiler.CCompiler.compile = parallelCCompile
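
Rather than hard-coding N, you could derive it from the machine or from the environment. The sketch below is one way to do that. The variable name BUILD_JOBS and the helper `parallel_jobs` are made up for this example and are not something distutils knows about.

```python
import multiprocessing
import os


def parallel_jobs(environ=os.environ, default=None):
    """Pick the number of parallel compilations.

    BUILD_JOBS is a hypothetical variable name chosen for this sketch.
    """
    value = environ.get("BUILD_JOBS", "")
    try:
        jobs = int(value)
    except ValueError:
        # Unset or not a number: fall back to the default or the CPU count.
        jobs = default if default is not None else multiprocessing.cpu_count()
    # Never return fewer than one job.
    return max(1, jobs)


N = parallel_jobs()
```

With this in setup.py, something like `BUILD_JOBS=8 python setup.py build` would select eight parallel compilations, and leaving the variable unset falls back to the CPU count.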
      
    3. For the sake of completeness, if you have multiple extensions, you can use the following solution, which builds the extensions themselves in parallel:

      import os
      import multiprocessing
      try:
          from concurrent.futures import ThreadPoolExecutor as Pool
      except ImportError:
          from multiprocessing.pool import ThreadPool as LegacyPool
      
          # To ensure the with statement works. Required for some older 2.7.x releases
          class Pool(LegacyPool):
              def __enter__(self):
                  return self
      
              def __exit__(self, *args):
                  self.close()
                  self.join()
      
      def build_extensions(self):
          """Function to monkey-patch
          distutils.command.build_ext.build_ext.build_extensions
      
          """
          self.check_extensions_list(self.extensions)
      
          try:
              num_jobs = os.cpu_count()
          except AttributeError:
              num_jobs = multiprocessing.cpu_count()
      
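          with Pool(num_jobs) as pool:
              # list() consumes the results so that an exception raised in a
              # worker (a failed build) propagates instead of being dropped
              list(pool.map(self.build_extension, self.extensions))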
      
      def compile(
          self, sources, output_dir=None, macros=None, include_dirs=None,
          debug=0, extra_preargs=None, extra_postargs=None, depends=None,
      ):
          """Function to monkey-patch distutils.ccompiler.CCompiler"""
          macros, objects, extra_postargs, pp_opts, build = self._setup_compile(
              output_dir, macros, include_dirs, sources, depends, extra_postargs
          )
          cc_args = self._get_cc_args(pp_opts, debug, extra_preargs)
      
          for obj in objects:
              try:
                  src, ext = build[obj]
              except KeyError:
                  continue
              self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
      
          # Return *all* object filenames, not just the ones we just built.
          return objects
      
      
      from distutils.ccompiler import CCompiler
      from distutils.command.build_ext import build_ext
      build_ext.build_extensions = build_extensions
      CCompiler.compile = compile
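
One detail to keep in mind with the concurrent.futures pool: ThreadPoolExecutor.map returns a lazy iterator, and an exception raised in a worker is re-raised only when the corresponding result is retrieved. A failure in build_extension therefore surfaces only if the results of pool.map are consumed, for example with list(). The following sketch illustrates this with `fake_build`, a hypothetical stand-in for build_extension. It does not touch distutils at all.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_build(name):
    """Hypothetical stand-in for build_extension; fails for one extension."""
    if name == "broken":
        raise RuntimeError("compilation failed for " + name)
    return name


extensions = ["a", "broken", "b"]

# The results are never consumed, so the RuntimeError goes unnoticed.
with ThreadPoolExecutor(2) as pool:
    pool.map(fake_build, extensions)
silent = True  # reached only because no exception propagated

# Consuming the results with list() re-raises the worker's exception.
try:
    with ThreadPoolExecutor(2) as pool:
        list(pool.map(fake_build, extensions))
    consumed_raised = False
except RuntimeError:
    consumed_raised = True
```

The multiprocessing.pool.ThreadPool fallback behaves differently: its map blocks until all jobs are done and raises a worker's exception directly.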