I'm trying to access a native (fortran) library (mylib.so
) loaded by JNA. The library is accessed in parallal by a Spark-Job. So far I have not synchronized the method calls (nor the library) as the call to the shared library is the bottleneck in my computations and they must run in parallel.
I get the following Error:
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007ffbcb5f8dcd, pid=58569, tid=140708155152128
#
# JRE version: Java(TM) SE Runtime Environment (8.0_60-b27) (build 1.8.0_60-b27)
*** Error in `/usr/java/jdk1.8.0_60/jre/bin/java': double free or corruption (!prev): 0x0000000001b756d0 ***
*** Error in `/usr/java/jdk1.8.0_60/jre/bin/java': free(): corrupted unsorted chunks: 0x0000000001b75010 ***
*** Error in `/usr/java/jdk1.8.0_60/jre/bin/java': double free or corruption (!prev): 0x0000000001b756d0 ***
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.60-b23 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libc.so.6+0x38dcd]======= Backtrace: =========
======= Backtrace: =========
======= Backtrace: =========
/lib64/libc.so.6(+0x7d053)[0x7ffbcb63d053]
/lib64/libc.so.6(+0x7d053)[0x7ffbcb63d053]
/lib64/libc.so.6(+0x7d053)[0x7ffbcb63d053]
/lib64/libc.so.6(+0x38e90)[0x7ffbcb5f8e90]
/usr/java/jdk1.8.0_60/jre/lib/amd64/server/libjvm.so(+0x5d43f9)[0x7ffbcabba3f9]
/lib64/libc.so.6(+0x38e90)[0x7ffbcb5f8e90]
/lib64/libc.so.6(+0x38eb5)[0x7ffbcb5f8eb5]
/lib64/libc.so.6(+0x38e69)[0x7ffbcb5f8e69]
/lib64/libc.so.6(+0x38eb5)[0x7ffbcb5f8eb5]
__run_exit_handlers+0x3d
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
/opt/usr/local/lib/mylib.so(for_exit+0x19)[0x7ff9285b0fa9]
/lib64/libc.so.6(+0x38eb5)[0x7ffbcb5f8eb5]
/opt/usr/local/lib/mylib.so(for_exit+0x19)[0x7ff9285b0fa9]
/opt/usr/local/lib/mylib.so(for__once_private+0x27)[0x7ff9285660b7]
/opt/usr/local/lib/mylib.so(for__once_private+0x27)[0x7ff9285660b7]
/opt/usr/local/lib/mylib.so(for__acquire_lun+0x814)[0x7ff92855ea04]
/opt/usr/local/lib/mylib.so(for__acquire_lun+0x814)[0x7ff92855ea04]
/opt/usr/local/lib/mylib.so(for_write_int_fmt+0x9c)[0x7ff92857f83c]
/opt/usr/local/lib/mylib.so(for__acquire_lun+0x814)[0x7ff92855ea04]
/opt/usr/local/lib/mylib.so(for_write_int_fmt+0x9c)[0x7ff92857f83c]
/opt/usr/local/lib/mylib.so(seterr_+0x1c8)[0x7ff928515328]
/opt/usr/local/lib/mylib.so(for_write_int_fmt+0x9c)[0x7ff92857f83c]
/opt/usr/local/lib/mylib.so(seterr_+0x1c8)[0x7ff928515328]
/opt/usr/local/lib/mylib.so(ddl2sf_+0x322)[0x7ff92850eb02]
/opt/usr/local/lib/mylib.so(seterr_+0x1c8)[0x7ff928515328]
/opt/usr/local/lib/mylib.so(entsrc_+0x80)[0x7ff92850edb0]
/opt/usr/local/lib/mylib.so(bspline_+0x52c)[0x7ff92850d66c]
/opt/usr/local/lib/mylib.so(ddl2sf_+0x2e7)[0x7ff92850eac7]
/opt/usr/local/lib/mylib.so(enter_+0x55)[0x7ff92850ecd5]
/opt/usr/local/lib/mylib.so(rspbsp_+0x4a)[0x7ff928523cda]
/opt/usr/local/lib/mylib.so(bspline_+0x52c)[0x7ff92850d66c]
/opt/usr/local/lib/mylib.so(ddl2sf_+0x16d)[0x7ff92850e94d]
/tmp/jna--1845237309/jna4302334124297214663.tmp(ffi_call_unix64+0x4c)[0x7ff92909465c]
/tmp/jna--1845237309/jna4302334124297214663.tmp(ffi_call+0x1d4)[0x7ff929094164]
/opt/usr/local/lib/mylib.so(bspline_+0x52c)[0x7ff92850d66c]
/opt/usr/local/lib/mylib.so(rspbsp_+0x4a)[0x7ff928523cda]
/tmp/jna--1845237309/jna4302334124297214663.tmp(+0x5870)[0x7ff929087870]
/tmp/jna--1845237309/jna4302334124297214663.tmp(Java_com_sun_jna_Native_invokeVoid+0x22)[0x7ff92908a462]
[0x7ffbb5015994]
AFAIK this has to that the native library is not thread-safe? From the stacktracke it seems to me that the actual problem is libc.so
, or is it my own library mylib.so
?
Case that the problem is in my own library, would it be possible to overcome the problem by making multiple physical copies of the shared object, one for each thread for example?
This would generally indicate an issue in your shared library or its use. If you did not design the library to be thread-safe then it is likely not. A single process cannot load a native library multiple times, so there is no easy solution in that way. There are a couple of avenues: