python numpy segmentation-fault ctypes icc

Segfault occurs when my shared library is optimized by icc -O3 or -O2 and used via Python ctypes


This behavior is strange because I could not get the segfault:

  • if the shared library was compiled without optimization or with weaker optimization (-O0 or -O1)
  • if the shared library was compiled with gcc, even with an optimization flag (-O3)
  • if I ran the code from a pure C program (not via ctypes)

Furthermore, I could not get the segfault on some machines.

If you find a bug in my code, so much the better, but I also have some more general questions:

  1. Could this be an icc or ctypes bug? Is it OK to report it to an issue tracker even though I can reproduce the strange behavior only in my specific environment?
  2. I tried to debug the code, but because the bug appears only when the code is optimized, the debugger reports a lot of "xxx is defined but not allocated (optimized away)". Is there a better way to debug optimized code?

How to reproduce the bug

Given the library source code strange.c and the Python script run.py, I get the segfault with:

icc -O3 -Wall -shared strange.c -o libstrange.so
python run.py

Note that I could reproduce this bug on one of my machines:

  • uname -m: i686
  • OS: Ubuntu 10.04.2 LTS
  • icc: 12.0.0 20101006
  • Python: 2.6.5
  • Numpy: 1.3.0

but not on

  • uname -m: i686
  • OS: Ubuntu 10.10
  • icc: 12.0.3 20110309
  • Python: 2.6.6
  • Numpy: 1.3.0

or

  • uname -m: x86_64
  • OS: Scientific Linux SL release 5.5 (Boron)
  • icc: 12.0.0 20101006
  • Python: 2.6.5
  • Numpy: 1.5.0b1

Code

The full set of code is available here (tkf / ctypes_icc / source – Bitbucket) and is reproduced below. The repository also contains a Makefile and a shell script that run the program and check the exit code for every optimization flag with both compilers (gcc and icc). The original program was a simulation for my research; this reduced version does nothing meaningful.
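The Makefile and shell script are not reproduced in this post, so the following is only a rough Python sketch of the same idea, not the actual script from the repository. It rebuilds libstrange.so with each compiler and optimization flag using the build command shown above, runs run.py in a child process, and labels the outcome. The names build_command, describe_exit, and run_matrix are made up for this illustration, and the sketch assumes a POSIX system, where a child killed by a signal is reported as a negative return code.

```python
import shutil
import signal
import subprocess
import sys

COMPILERS = ("gcc", "icc")
OPT_FLAGS = ("-O0", "-O1", "-O2", "-O3")


def build_command(compiler, opt_flag):
    """Return the build command from the question for one compiler/flag pair."""
    return [compiler, opt_flag, "-Wall", "-shared", "strange.c", "-o", "libstrange.so"]


def describe_exit(returncode):
    """Translate a subprocess return code into a short label.

    On POSIX, subprocess reports death by signal N as return code -N.
    """
    if returncode == 0:
        return "ok"
    if returncode < 0:
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return "killed by signal %d" % -returncode
    return "exit code %d" % returncode


def run_matrix():
    """Build and run every compiler/flag combination and collect the outcomes."""
    results = {}
    for compiler in COMPILERS:
        if shutil.which(compiler) is None:
            continue  # this compiler is not installed on the current machine
        for opt_flag in OPT_FLAGS:
            subprocess.check_call(build_command(compiler, opt_flag))
            returncode = subprocess.call([sys.executable, "run.py"])
            results[(compiler, opt_flag)] = describe_exit(returncode)
    return results
```

Calling run_matrix() from the directory that holds strange.c and run.py returns a dictionary keyed by (compiler, flag), which makes it easy to see which combinations end in a segfault on a given machine.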

strange.c:

typedef struct{
  int num_n;
  double dt, ie, gl, isyn, ssyn, tau1, tau2, lmd1, lmd2, k1_mean, k2_mean;
  double *vi, *v0;
} StrangeStruct;


void
func(double * v0, double * vt, double dt,
     double gl, double isyn, double ie, double isyn_estimate, int num_n)
{
  int i;
  for (i = 0; i < num_n; ++i){
    v0[i] = vt[i] + dt + gl + isyn + ie + isyn_estimate;
  }
}

int
StrangeStruct_func(StrangeStruct *self)
{
  double isyn_estimate;
  isyn_estimate =
    self->ssyn * (self->lmd1 * self->k1_mean - self->lmd2 * self->k2_mean) /
    (self->tau1 - self->tau2);
  func(self->v0, self->vi, self->dt, self->gl, self->isyn,
       self->ie, isyn_estimate, self->num_n);
  return 0;
}

run.py:

from ctypes import POINTER, pointer, c_int, c_double, Structure
import numpy

c_double_p = POINTER(c_double)


class StrangeStruct(Structure):
    _fields_ = [
        ("num_n", c_int),
        ("dt", c_double),
        ("ie", c_double),
        ("gl", c_double),
        ("isyn", c_double),
        ("ssyn", c_double),
        ("tau1", c_double),
        ("tau2", c_double),
        ("lmd1", c_double),
        ("lmd2", c_double),
        ("k1_mean", c_double),
        ("k2_mean", c_double),
        ("vi", c_double_p),
        ("v0", c_double_p),
        ]


StrangeStruct_p = POINTER(StrangeStruct)

ifnet_a2a2 = numpy.ctypeslib.load_library('libstrange.so', '.')
ifnet_a2a2.StrangeStruct_func.restype = c_int
ifnet_a2a2.StrangeStruct_func.argtypes = [StrangeStruct_p]


def func(struct):
    ifnet_a2a2.StrangeStruct_func(pointer(struct))


if __name__ == '__main__':
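    # vrest and th are not in _fields_, so ctypes stores them as plain
    # Python attributes; fields left unset (isyn, lmd1, ...) start at zero.
    ifn = StrangeStruct(
        num_n=100, dt=0.1, gl=0.1, vrest=-60, ie=-3.7, th=-40,
        ssyn=0.5, tau1=3, tau2=1,
        )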
    v0 = numpy.zeros(ifn.num_n, dtype=float)
    vi = numpy.zeros(ifn.num_n, dtype=float)
    ifn.v0 = v0.ctypes.data_as(c_double_p)
    ifn.vi = vi.ctypes.data_as(c_double_p)

    func(ifn)

    v0 + vi
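
One quick check on the ctypes side, offered as a sketch rather than a diagnosis: print the size and field offsets that ctypes computes for StrangeStruct, so they can be compared with sizeof and offsetof from a small C program built by each compiler. The helper name struct_layout is made up for this illustration; the structure repeats the fields of run.py in the same order.

```python
import ctypes
from ctypes import POINTER, Structure, c_double, c_int

c_double_p = POINTER(c_double)


class StrangeStruct(Structure):
    # Same fields, in the same order, as in run.py.
    _fields_ = [("num_n", c_int)] + [
        (name, c_double)
        for name in ("dt", "ie", "gl", "isyn", "ssyn", "tau1", "tau2",
                     "lmd1", "lmd2", "k1_mean", "k2_mean")
    ] + [("vi", c_double_p), ("v0", c_double_p)]


def struct_layout(struct_type):
    """Return (field name, byte offset, byte size) for every field."""
    return [
        (name, getattr(struct_type, name).offset, getattr(struct_type, name).size)
        for name, _ctype in struct_type._fields_
    ]


if __name__ == "__main__":
    print("sizeof(StrangeStruct) =", ctypes.sizeof(StrangeStruct))
    for name, offset, size in struct_layout(StrangeStruct):
        print("%-8s offset=%3d size=%d" % (name, offset, size))
```

If the numbers agree with the C side on the failing machine, a layout mismatch can be ruled out and the problem lies elsewhere.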

Solution

  • It is usually not possible to mix binaries compiled with gcc and icc, and in this case Python itself was built with gcc. You could try icc's "gcc compatibility" mode, which is selected with the -gcc-version flag. That might make it work, but you may still run into problems.
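
To confirm which compiler built the interpreter on each machine, the standard library's platform module can report it. This is a minimal sketch; the function name interpreter_build_info is made up for this illustration, and the exact format of the compiler string depends on how Python was built.

```python
import platform


def interpreter_build_info():
    """Collect build details of the running interpreter from the standard library."""
    return {
        "compiler": platform.python_compiler(),  # compiler used to build Python
        "machine": platform.machine(),           # comparable to `uname -m`
        "python": platform.python_version(),
    }


if __name__ == "__main__":
    for key, value in sorted(interpreter_build_info().items()):
        print("%s: %s" % (key, value))
```

Running this on the machines listed in the question shows whether the interpreter and the icc-built library come from different toolchains.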