Address sanitizing Boost.Python modules


My project includes a large C++ library and Python bindings (via Boost.Python). The test suite is mostly written on top of the Python bindings, and I would like to run it with sanitizers, starting with ASAN.

I'm running macOS (10.13.1 FWIW, but I had the problem with previous versions too), and I can't find a way to run ASAN on Python modules. I very much doubt this is related to Boost.Python; I suppose it's the same with other binding techniques.

Here is a simple Python module:

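// hello_ext.cc
#include <cstring>  // std::strcpy
#include <boost/python.hpp>

// Deliberately buggy: returns a pointer to freed memory, so that ASAN
// has a heap-use-after-free to report.
char const* greet()
{
  auto* res = new char[100];
  std::strcpy(res, "Hello, world!");
  delete [] res;
  return res;
}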

BOOST_PYTHON_MODULE(hello_ext)
{
  using namespace boost::python;
  def("greet", greet);
}

Here is the Makefile I used, written for MacPorts:

# Makefile
CXX = clang++-mp-4.0
CXXFLAGS = -g -std=c++14 -fsanitize=address -fno-omit-frame-pointer
CPPFLAGS = -isystem/opt/local/include $$($(PYTHON_CONFIG) --includes)
LDFLAGS = -L/opt/local/lib
PYTHON = python3.5
PYTHON_CONFIG = python3.5-config
LIBS = -lboost_python3-mt $$($(PYTHON_CONFIG) --ldflags)

all: hello_ext.so

hello_ext.so: hello_ext.cc
        $(CXX) $(CPPFLAGS) $(CXXFLAGS) $(LDFLAGS) -shared -o $@ $< $(LIBS)

check: all
        $(ENV) $(PYTHON) -c 'import hello_ext; print(hello_ext.greet())'

clean:
        -rm -f hello_ext.so

Without ASAN, everything works (too well, actually, given the use-after-free). With ASAN, I hit LD_PRELOAD-like issues:

$ make check
python -c 'import hello_ext; print(hello_ext.greet())'
==19013==ERROR: Interceptors are not working. This may be because AddressSanitizer is loaded too late (e.g. via dlopen). Please launch the executable with:
DYLD_INSERT_LIBRARIES=/opt/local/libexec/llvm-4.0/lib/clang/4.0.1/lib/darwin/libclang_rt.asan_osx_dynamic.dylib
"interceptors not installed" && 0make: *** [check] Abort trap: 6
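
The runtime path that ASAN asks for is printed on a line of its own, so a test harness could pick it up from the message instead of hard-coding it. A minimal sketch, assuming the message format shown above; `suggested_runtime` is a hypothetical helper name, not part of any tool:

```python
import re

# Matches the line on which ASAN prints the suggested DYLD_INSERT_LIBRARIES value.
_HINT = re.compile(r"^DYLD_INSERT_LIBRARIES=(\S+)$", re.MULTILINE)


def suggested_runtime(stderr_text):
    """Return the runtime path ASAN asks us to preload, or None if there is no hint."""
    match = _HINT.search(stderr_text)
    return match.group(1) if match else None
```

It only parses text; whether preloading that path actually helps depends on which executable is launched.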

Okay, let's do that and define DYLD_INSERT_LIBRARIES:

$ DYLD_INSERT_LIBRARIES=/opt/local/libexec/llvm-4.0/lib/clang/4.0.1/lib/darwin/libclang_rt.asan_osx_dynamic.dylib \
  python -c 'import hello_ext; print(hello_ext.greet())'
==19023==ERROR: Interceptors are not working. This may be because AddressSanitizer is loaded too late (e.g. via dlopen). Please launch the executable with:
DYLD_INSERT_LIBRARIES=/opt/local/libexec/llvm-4.0/lib/clang/4.0.1/lib/darwin/libclang_rt.asan_osx_dynamic.dylib
"interceptors not installed" && 0zsh: abort      DYLD_INSERT_LIBRARIES= python -c 'import hello_ext; print(hello_ext.greet())'

Suspecting SIP, I disabled it; I also resolved the symlinks by hand:

$ DYLD_INSERT_LIBRARIES=/opt/local/libexec/llvm-4.0/lib/clang/4.0.1/lib/darwin/libclang_rt.asan_osx_dynamic.dylib \
  /opt/local/Library/Frameworks/Python.framework/Versions/3.5/bin/python3.5 -c 'import hello_ext; print(hello_ext.greet())'
==19026==ERROR: Interceptors are not working. This may be because AddressSanitizer is loaded too late (e.g. via dlopen). Please launch the executable with:
DYLD_INSERT_LIBRARIES=/opt/local/libexec/llvm-4.0/lib/clang/4.0.1/lib/darwin/libclang_rt.asan_osx_dynamic.dylib
"interceptors not installed" && 0zsh: abort      DYLD_INSERT_LIBRARIES=  -c 'import hello_ext; print(hello_ext.greet())'

What's the right way to do this? I also tried loading libasan with ctypes.PyDLL, but even with sys.setdlopenflags(os.RTLD_NOW | os.RTLD_GLOBAL) I can't get it to work.


Solution

  • So, I finally managed to get this to work:

    $ libasan=/opt/local/libexec/llvm-4.0/lib/clang/4.0.1/lib/darwin/libclang_rt.asan_osx_dynamic.dylib
    $ python=/opt/local/Library/Frameworks/Python.framework/Versions/3.5/Resources/Python.app/Contents/MacOS/Python
    $ DYLD_INSERT_LIBRARIES=$libasan $python -c 'import hello_ext; print(hello_ext.greet())'
    =================================================================
    ==70859==ERROR: AddressSanitizer: heap-use-after-free on address 0x60b000002770 at pc 0x000108c2ef60 bp 0x7ffee6fe8c20 sp 0x7ffee6fe83c8
    READ of size 2 at 0x60b000002770 thread T0
        #0 0x108c2ef5f in wrap_strlen (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x14f5f)
        #1 0x109a8d939 in PyUnicode_FromString (Python:x86_64+0x58939)
    [...]
    

    What changed? Nothing in the compilation chain, just the invocation.

    Let PYDIR=/opt/local/Library/Frameworks/Python.framework/Versions/3.5. Previously I was calling $PYDIR/bin/python3.5 (/opt/local/bin/python3.5 is a symlink to it); now I call $PYDIR/Resources/Python.app/Contents/MacOS/Python.
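
The working invocation can be wrapped for a test driver. This is a minimal sketch, assuming a framework build laid out like the one above; `framework_interpreter` and `asan_env` are hypothetical helper names, and the colon-separated form of DYLD_INSERT_LIBRARIES follows the dyld manual:

```python
import os
import posixpath


def framework_interpreter(pydir):
    """Path of the interpreter inside Python.app for a framework rooted at pydir."""
    return posixpath.join(pydir, "Resources", "Python.app", "Contents", "MacOS", "Python")


def asan_env(libasan, base_env=None):
    """Copy base_env (default: os.environ) with libasan first in DYLD_INSERT_LIBRARIES."""
    env = dict(os.environ if base_env is None else base_env)
    previous = env.get("DYLD_INSERT_LIBRARIES")
    # Keep any library that was already being inserted, after the ASAN runtime.
    env["DYLD_INSERT_LIBRARIES"] = libasan + (":" + previous if previous else "")
    return env
```

A driver would then call something like subprocess.run([framework_interpreter(PYDIR), "-c", "import hello_ext; print(hello_ext.greet())"], env=asan_env(libasan)), so that the variable is set on the process that really hosts the interpreter.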

    To understand what was going on, I ran DYLD_INSERT_LIBRARIES=$libasan python3.5 and looked at its open files:

    $ ps
      PID TTY           TIME CMD
      900 ttys000    0:07.96 -zsh
    70897 ttys000    0:00.11 /opt/local/Library/Frameworks/Python.framework/Versions/3.5/Resources/Python.app/Contents/MacOS/Python
    53528 ttys001    0:05.14 -zsh
      920 ttys002    0:10.28 -zsh
    $ lsof -p 70897
    COMMAND   PID USER   FD   TYPE DEVICE   SIZE/OFF       NODE NAME
    Python  70897 akim  cwd    DIR    1,4        480 8605949500 /Users/akim/src/lrde/vcsn/experiment/sanitizer
    Python  70897 akim  txt    REG    1,4      12988 8591019542 /opt/local/Library/Frameworks/Python.framework/Versions/3.5/Resources/Python.app/Contents/MacOS/Python
    Python  70897 akim  txt    REG    1,4    2643240 8591012758 /opt/local/Library/Frameworks/Python.framework/Versions/3.5/Python
    Python  70897 akim  txt    REG    1,4     107524 8590943656 /opt/local/lib/libintl.8.dylib
    Python  70897 akim  txt    REG    1,4    2097528 8590888556 /opt/local/lib/libiconv.2.dylib
    Python  70897 akim  txt    REG    1,4      20224 8591016920 /opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/lib-dynload/_heapq.cpython-35m-darwin.so
    Python  70897 akim  txt    REG    1,4     326996 8591375651 /opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/readline/gnureadline.cpython-35m-darwin.so
    Python  70897 akim  txt    REG    1,4     603008 8605907803 /opt/local/lib/libncurses.6.dylib
    Python  70897 akim  txt    REG    1,4     837248 8606849556 /usr/lib/dyld
    Python  70897 akim  txt    REG    1,4 1155837952 8606860187 /private/var/db/dyld/dyld_shared_cache_x86_64h
    Python  70897 akim    0u   CHR   16,0  0t2756038        667 /dev/ttys000
    Python  70897 akim    1u   CHR   16,0  0t2756038        667 /dev/ttys000
    Python  70897 akim    2u   CHR   16,0  0t2756038        667 /dev/ttys000
    

    Obviously libasan is not there, and that's the whole problem. However, I also noticed that ps reports a different Python than the one I launched (and, of course, that binary is among the open files).

    It turns out that there are several Python executables in this directory: $PYDIR/bin/python3.5 and $PYDIR/Resources/Python.app/Contents/MacOS/Python, and the first one, in one way or another, bounces to the second. If I run the second with DYLD_INSERT_LIBRARIES=$libasan, lsof shows:

    $ lsof -p 71114
    COMMAND   PID USER   FD   TYPE DEVICE  SIZE/OFF       NODE NAME
    Python  71114 akim  cwd    DIR    1,4       480 8605949500 /Users/akim/src/lrde/vcsn/experiment/sanitizer
    Python  71114 akim  txt    REG    1,4     12988 8591019542 /opt/local/Library/Frameworks/Python.framework/Versions/3.5/Resources/Python.app/Contents/MacOS/Python
    Python  71114 akim  txt    REG    1,4   3013168 8604479549 /opt/local/libexec/llvm-4.0/lib/clang/4.0.1/lib/darwin/libclang_rt.asan_osx_dynamic.dylib
    Python  71114 akim  txt    REG    1,4   2643240 8591012758 /opt/local/Library/Frameworks/Python.framework/Versions/3.5/Python
    Python  71114 akim  txt    REG    1,4    107524 8590943656 /opt/local/lib/libintl.8.dylib
    Python  71114 akim  txt    REG    1,4   2097528 8590888556 /opt/local/lib/libiconv.2.dylib
    Python  71114 akim  txt    REG    1,4     20224 8591016920 /opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/lib-dynload/_heapq.cpython-35m-darwin.so
    Python  71114 akim  txt    REG    1,4    326996 8591375651 /opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/readline/gnureadline.cpython-35m-darwin.so
    Python  71114 akim  txt    REG    1,4    603008 8605907803 /opt/local/lib/libncurses.6.dylib
    Python  71114 akim  txt    REG    1,4    837248 8606849556 /usr/lib/dyld
    Python  71114 akim    0u   CHR   16,0 0t2781894        667 /dev/ttys000
    Python  71114 akim    1u   CHR   16,0 0t2781894        667 /dev/ttys000
    Python  71114 akim    2u   CHR   16,0 0t2781894        667 /dev/ttys000
    

    \o/ libasan is there! So apparently, the first Python calls the second one, and DYLD_INSERT_LIBRARIES is not forwarded.
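
The check I did by eye on the lsof listings can be automated. A minimal sketch, assuming the nine-column `lsof -p` layout shown above; `mapped_files` and `asan_loaded` are hypothetical helper names, and the test is only a substring match on file names:

```python
def mapped_files(lsof_output):
    """Return the NAME column of every regular-file (REG) row of `lsof -p` output."""
    names = []
    for line in lsof_output.splitlines():
        # NAME is the 9th column and may contain spaces, so split at most 8 times.
        fields = line.split(None, 8)
        if len(fields) == 9 and fields[4] == "REG":
            names.append(fields[8].rstrip())
    return names


def asan_loaded(lsof_output):
    """True if one of the mapped files looks like the ASAN runtime library."""
    return any("libclang_rt.asan" in name for name in mapped_files(lsof_output))
```

Fed the first listing it returns False, and fed the second it returns True, which matches what the two runs showed.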

    I currently have no idea why there are two Pythons. It does not seem to be specific to MacPorts, as it's also the case for Apple's Python:

    $ cd /System/Library/Frameworks/Python.framework/Versions/2.7
    $ ls -l bin/python*
    lrwxr-xr-x  1 root  wheel      7 10 déc 08:17 bin/python -> python2
    lrwxr-xr-x  1 root  wheel     14 10 déc 08:17 bin/python-config -> python2-config
    lrwxr-xr-x  1 root  wheel      9 10 déc 08:17 bin/python2 -> python2.7
    lrwxr-xr-x  1 root  wheel     16 10 déc 08:17 bin/python2-config -> python2.7-config
    -rwxr-xr-x  1 root  wheel  43104  1 déc 21:42 bin/python2.7
    -rwxr-xr-x  1 root  wheel   1818 16 jul 02:20 bin/python2.7-config
    lrwxr-xr-x  1 root  wheel      8 10 déc 08:17 bin/pythonw -> pythonw2
    lrwxr-xr-x  1 root  wheel     10 10 déc 08:17 bin/pythonw2 -> pythonw2.7
    -rwxr-xr-x  1 root  wheel  43104  1 déc 21:42 bin/pythonw2.7
    $ ls -l Resources/Python.app/Contents/MacOS/Python
    -rwxr-xr-x  1 root  wheel  51744  1 déc 21:48 Resources/Python.app/Contents/MacOS/Python
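
This listing also explains why resolving the symlinks did not help: the chain from bin/python ends at bin/python2.7, a regular file, and the hop to Python.app is not a symlink at all. A minimal sketch that mimics the layout in a temporary directory (the file names mirror the listing, the files themselves are empty placeholders, and `build_fake_framework` is a hypothetical helper):

```python
import os
import tempfile


def build_fake_framework(root):
    """Recreate the bin/ symlink chain and the Python.app binary as empty files."""
    bindir = os.path.join(root, "bin")
    appdir = os.path.join(root, "Resources", "Python.app", "Contents", "MacOS")
    os.makedirs(bindir)
    os.makedirs(appdir)
    open(os.path.join(bindir, "python2.7"), "w").close()  # launcher: regular file
    open(os.path.join(appdir, "Python"), "w").close()     # the second executable
    os.symlink("python2.7", os.path.join(bindir, "python2"))
    os.symlink("python2", os.path.join(bindir, "python"))
    return os.path.join(bindir, "python"), os.path.join(appdir, "Python")


with tempfile.TemporaryDirectory() as root:
    entry, app_binary = build_fake_framework(root)
    resolved = os.path.realpath(entry)             # follows python -> python2 -> python2.7
    same = os.path.samefile(resolved, app_binary)  # compared while the files still exist

print(os.path.basename(resolved))  # python2.7
print(same)                        # False
```

So no amount of symlink resolution leads to the second executable; its path has to be given explicitly.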