A fatal error has been detected by the Java Runtime Environment when Ignite native persistence is on


I am trying to put an Apache Arrow vector into Ignite. This works fine with native persistence turned off, but once I turn native persistence on, the JVM crashes every time. I create an IntVector first, then put it into Ignite:

RootAllocator allocator = new RootAllocator(Long.MAX_VALUE);
IntVector intVector = new IntVector("int", allocator);
intVector.setSafe(0, 1);
igniteCache.put("key", intVector);

I can get "key" on the first run. After I turn on native persistence and comment out the code above, the JVM crashes when calling

IntVector intVector = (IntVector) igniteCache.get("key");
intVector.get(0);    // crashes here according to the stack trace below (IntVector.get -> ArrowBuf.getByte)

with the following error (excerpt):

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f778362098d, pid=30450, tid=0x00007f7782a28700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_271-b09) (build 1.8.0_271-b09)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.271-b09 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0xaa898d]  Unsafe_GetNativeByte+0xad
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
---------------  T H R E A D  ---------------

Current thread (0x00007f777c00c800):  JavaThread "main" [_thread_in_vm, id=30451, stack(0x00007f7782929000,0x00007f7782a29000)]

siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x00007f96825662d8

Registers:
RAX=0x00007f777c00c800, RBX=0x00007f777c00c800, RCX=0x00000005cc456068, RDX=0x00007f7783f9f340
RSP=0x00007f7782a27760, RBP=0x00007f7782a27790, RSI=0x0000000000000001, RDI=0x0000000000000010
R8 =0x00007f777c00c800, R9 =0x0000000000000004, R10=0x00007f776d6f2cb8, R11=0x00007f776d6f2c98
R12=0x00007f96825662d8, R13=0x00007f7782a277f8, R14=0x00007f7782a27870, R15=0x00007f7783b4e2ec
RIP=0x00007f778362098d, EFLAGS=0x0000000000010246, CSGSFS=0x002b000000000033, ERR=0x0000000000000004
  TRAPNO=0x000000000000000e

Top of Stack: (sp=0x00007f7782a27760)
0x00007f7782a27760:   00000007c000e158 00007f776c47b130
0x00007f7782a27770:   0000000000000000 00007f7782a277f8
0x00007f7782a27780:   00007f7782a27870 00007f777c00c800
0x00007f7782a27790:   00007f7782a277e0 00007f776d6f2d28
0x00007f7782a277a0:   00000005cc456068 00007f747f23c1e0
0x00007f7782a277b0:   00007f7782a27800 00000000000000b6
0x00007f7782a277c0:   0000000000000000 00007f747f23d03e
0x00007f7782a277d0:   00007f7782a27870 00007f777c00c800
0x00007f7782a277e0:   00007f7782a27850 00007f776d0079c0
0x00007f7782a277f0:   00007f776d0079c0 00007f96825662d8
0x00007f7782a27800:   000000071e23bbd0 00000005cc456068
0x00007f7782a27810:   00007f7782a27810 00007f747f23d03e
0x00007f7782a27820:   00007f7782a27870 00007f747f2ec4e0
0x00007f7782a27830:   0000000000000000 00007f747f23d068
0x00007f7782a27840:   00007f7782a277f8 00007f7782a27860
0x00007f7782a27850:   00007f7782a278b8 00007f776d0079c0
0x00007f7782a27860:   0000000000000000 00007f776d023f9f
0x00007f7782a27870:   000000071e23bbd0 00007f7782a27878
0x00007f7782a27880:   00007f74807265aa 00007f7782a278e8
0x00007f7782a27890:   00007f748072d228 0000000000000000
0x00007f7782a278a0:   00007f7480726608 00007f7782a27860
0x00007f7782a278b0:   00007f7782a278e0 00007f7782a27930
0x00007f7782a278c0:   00007f776d007d00 0000000000000000
0x00007f7782a278d0:   0000000000000000 0000000000000000
0x00007f7782a278e0:   0000000000000000 000000071b339de8
0x00007f7782a278f0:   00007f7782a278f0 00007f748071d730
0x00007f7782a27900:   00007f7782a27948 00007f747f239028
0x00007f7782a27910:   0000000000000000 00007f748071d778
0x00007f7782a27920:   00007f7782a278e0 00007f7782a27940
0x00007f7782a27930:   00007f7782a27998 00007f776d007d00
0x00007f7782a27940:   0000000000000000 000000071b339de8
0x00007f7782a27950:   00000005cc55ec98 00007f7782a27958 

Instructions: (pc=0x00007f778362098d)
0x00007f778362096d:   ff 48 8d 05 73 6f 4f 00 c7 83 70 02 00 00 06 00
0x00007f778362097d:   00 00 8b 38 e8 12 61 75 ff c6 80 94 02 00 00 01
0x00007f778362098d:   45 0f b6 34 24 c6 80 94 02 00 00 00 4c 8b 63 48
0x00007f778362099d:   49 8b 44 24 10 4d 8b 6c 24 08 48 83 38 00 74 1c 

Register to memory mapping:

RAX=0x00007f777c00c800 is a thread
RBX=0x00007f777c00c800 is a thread
RCX=0x00000005cc456068 is an oop
sun.misc.Unsafe 
 - klass: 'sun/misc/Unsafe'
RDX=0x00007f7783f9f340: <offset 0x1e340> in /lib/x86_64-linux-gnu/libpthread.so.0 at 0x00007f7783f81000
RSP=0x00007f7782a27760 is pointing into the stack for thread: 0x00007f777c00c800
RBP=0x00007f7782a27790 is pointing into the stack for thread: 0x00007f777c00c800
RSI=0x0000000000000001 is an unknown value
RDI=0x0000000000000010 is an unknown value
R8 =0x00007f777c00c800 is a thread
R9 =0x0000000000000004 is an unknown value
R10=0x00007f776d6f2cb8 is at entry_point+56 in (nmethod*)0x00007f776d6f2b10
R11=0x00007f776d6f2c98 is at entry_point+24 in (nmethod*)0x00007f776d6f2b10
R12=0x00007f96825662d8 is an unknown value
R13=0x00007f7782a277f8 is pointing into the stack for thread: 0x00007f777c00c800
R14=0x00007f7782a27870 is pointing into the stack for thread: 0x00007f777c00c800
R15=0x00007f7783b4e2ec: <offset 0xfd62ec> in /usr/lib/jvm/jdk1.8.0_271/jre/lib/amd64/server/libjvm.so at 0x00007f7782b78000


Stack: [0x00007f7782929000,0x00007f7782a29000],  sp=0x00007f7782a27760,  free space=1017k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0xaa898d]  Unsafe_GetNativeByte+0xad
J 2166  sun.misc.Unsafe.getByte(J)B (0 bytes) @ 0x00007f776d6f2d28 [0x00007f776d6f2c80+0xa8]
j  org.apache.arrow.memory.ArrowBuf.getByte(J)B+14
j  org.apache.arrow.vector.BaseFixedWidthVector.isSet(I)I+10
j  org.apache.arrow.vector.IntVector.get(I)I+8
j  org.apache.ignite.examples.test.MainApplication.main([Ljava/lang/String;)V+99
v  ~StubRoutines::call_stub
V  [libjvm.so+0x68c2ba]  JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0xe1a
V  [libjvm.so+0x6d8700]  jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) [clone .isra.98] [clone .constprop.118]+0x1f0
V  [libjvm.so+0x6da9fb]  jni_CallStaticVoidMethod+0x15b
C  [libjli.so+0x889c]  JavaMain+0xa3c
C  [libpthread.so.0+0x9609]  start_thread+0xd9

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J 2166  sun.misc.Unsafe.getByte(J)B (0 bytes) @ 0x00007f776d6f2cb8 [0x00007f776d6f2c80+0x38]
j  org.apache.arrow.memory.ArrowBuf.getByte(J)B+14
j  org.apache.arrow.vector.BaseFixedWidthVector.isSet(I)I+10
j  org.apache.arrow.vector.IntVector.get(I)I+8
j  org.apache.ignite.examples.test.MainApplication.main([Ljava/lang/String;)V+99
v  ~StubRoutines::call_stub
Java Threads: ( => current thread )
  0x00007f772c097000 JavaThread "mgmt-#73" [_thread_blocked, id=30583, stack(0x00007f747f439000,0x00007f747f53a000)]
  0x00007f76b000d000 JavaThread "checkpoint-runner-#72" [_thread_blocked, id=30582, stack(0x00007f747f53a000,0x00007f747f63b000)]
  0x00007f76b000b000 JavaThread "checkpoint-runner-#71" [_thread_blocked, id=30581, stack(0x00007f747f63b000,0x00007f747f73c000)]
  0x00007f76b0009000 JavaThread "checkpoint-runner-#70" [_thread_blocked, id=30580, stack(0x00007f747f73c000,0x00007f747f83d000)]
  0x00007f76b0006800 JavaThread "checkpoint-runner-#69" [_thread_blocked, id=30579, stack(0x00007f747f83d000,0x00007f747f93e000)]
  0x00007f772c072800 JavaThread "db-checkpoint-thread-#68" [_thread_in_native, id=30578, stack(0x00007f747f93e000,0x00007f747fa3f000)]

....

GC Heap History (10 events):
Event: 3.808 GC heap before
{Heap before GC invocations=7 (full 2):
 PSYoungGen      total 383488K, used 365568K [0x0000000719700000, 0x000000073b300000, 0x00000007c0000000)
  eden space 365568K, 100% used [0x0000000719700000,0x000000072fc00000,0x000000072fc00000)
  from space 17920K, 0% used [0x000000073a180000,0x000000073a180000,0x000000073b300000)
  to   space 20992K, 0% used [0x0000000738a00000,0x0000000738a00000,0x0000000739e80000)
 ParOldGen       total 338944K, used 24142K [0x00000005cc400000, 0x00000005e0f00000, 0x0000000719700000)
  object space 338944K, 7% used [0x00000005cc400000,0x00000005cdb939b8,0x00000005e0f00000)
 Metaspace       used 41397K, capacity 43002K, committed 43304K, reserved 1087488K
  class space    used 4722K, capacity 5078K, committed 5160K, reserved 1048576K
Event: 3.822 GC heap after
Heap after GC invocations=7 (full 2):
 PSYoungGen      total 523776K, used 20987K [0x0000000719700000, 0x000000073c000000, 0x00000007c0000000)
  eden space 502784K, 0% used [0x0000000719700000,0x0000000719700000,0x0000000738200000)
  from space 20992K, 99% used [0x0000000738a00000,0x0000000739e7ee08,0x0000000739e80000)
  to   space 31744K, 0% used [0x000000073a100000,0x000000073a100000,0x000000073c000000)
 ParOldGen       total 338944K, used 32466K [0x00000005cc400000, 0x00000005e0f00000, 0x0000000719700000)
  object space 338944K, 9% used [0x00000005cc400000,0x00000005ce3b4800,0x00000005e0f00000)
 Metaspace       used 41397K, capacity 43002K, committed 43304K, reserved 1087488K
  class space    used 4722K, capacity 5078K, committed 5160K, reserved 1048576K
}

Deoptimization events (10 events):
Event: 5.314 Thread 0x00007f777c00c800 Uncommon trap: reason=class_check action=maybe_recompile pc=0x00007f776d7db69c method=org.apache.ignite.internal.util.StripedCompositeReadWriteLock.curIdx()I @ 5
Event: 5.315 Thread 0x00007f777c00c800 Uncommon trap: reason=unstable_if action=reinterpret pc=0x00007f776ddc6474 method=jdk.internal.org.objectweb.asm.ByteVector.putInt(I)Ljdk/internal/org/objectweb/asm/ByteVector; @ 13
Event: 5.315 Thread 0x00007f777e4c7000 Uncommon trap: reason=unstable_if action=reinterpret pc=0x00007f776d9d9da8 method=org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(Ljava/lang/Object;Ljava/lang/Throwable;Z)Z @ 40
Event: 5.316 Thread 0x00007f777c00c800 Uncommon trap: reason=class_check action=maybe_recompile pc=0x00007f776d7db69c method=org.apache.ignite.internal.util.StripedCompositeReadWriteLock.curIdx()I @ 5
Event: 5.316 Thread 0x00007f777c00c800 Uncommon trap: reason=class_check action=maybe_recompile pc=0x00007f776d7db69c method=org.apache.ignite.internal.util.StripedCompositeReadWriteLock.curIdx()I @ 5
Event: 5.317 Thread 0x00007f777c00c800 Uncommon trap: reason=class_check action=maybe_recompile pc=0x00007f776d718734 method=java.net.URLClassLoader.defineClass(Ljava/lang/String;Lsun/misc/Resource;)Ljava/lang/Class; @ 13
Event: 5.322 Thread 0x00007f777c00c800 Uncommon trap: reason=class_check action=maybe_recompile pc=0x00007f776d7db69c method=org.apache.ignite.internal.util.StripedCompositeReadWriteLock.curIdx()I @ 5
Event: 5.324 Thread 0x00007f777e4c7000 Uncommon trap: reason=unstable_if action=reinterpret pc=0x00007f776d412e68 method=org.apache.ignite.internal.util.future.GridCompoundFuture.futuresCountNoLock()I @ 25
Event: 5.325 Thread 0x00007f777c00c800 Uncommon trap: reason=class_check action=maybe_recompile pc=0x00007f776d7dbadc method=org.apache.ignite.internal.util.StripedCompositeReadWriteLock.curIdx()I @ 5
Event: 5.330 Thread 0x00007f777c00c800 Uncommon trap: reason=class_check action=maybe_recompile pc=0x00007f776dd30be0 method=org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(IJLorg/apache/ignite/internal/metric/IoStatisticsHolder;ZLjava/util/concurrent

Classes redefined (0 events):
No events
....

This has been torturing me for days; any hint would be helpful. Thanks.

Simple code to reproduce the problem:

import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.IntVector;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;

public class HelloWorld {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start("example-ignite.xml")) {
            IgniteCache<Integer, Object> cache = ignite.getOrCreateCache("myCache");
            // Put a string in the cache on the 1st run, then comment the put out on the 2nd run to test native persistence.
            cache.put(0, "222"); // no crash when this line is commented out on the 2nd run
            System.out.println(cache.get(0));

            // Put an Arrow vector in the cache on the 1st run, then comment the put out on the 2nd run to test native persistence.
            RootAllocator allocator = new RootAllocator();
            IntVector vector = new IntVector("int", allocator);
            vector.allocateNew();
            vector.setSafe(0, 111);
            vector.setValueCount(1);
            cache.put(1, vector);   // the JVM crashes when this line is commented out on the 2nd run
            System.out.println(cache.get(1));
        }
    }
}

Native persistence setting:

<property name="dataStorageConfiguration">
    <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
        <!-- <property name="walSegmentSize" value="#{128 * 1024 * 1024}"/> -->
        <property name="defaultDataRegionConfiguration">
            <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                <property name="persistenceEnabled" value="true"/>
            </bean>
        </property>
    </bean>
</property>

Dependencies:

    <properties>
        <ignite.version>2.9.1</ignite.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.apache.ignite</groupId>
            <artifactId>ignite-core</artifactId>
            <version>${ignite.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.ignite</groupId>
            <artifactId>ignite-spring</artifactId>
            <version>${ignite.version}</version>
        </dependency>
        <dependency>
            <groupId>io.netty</groupId>
            <artifactId>netty-common</artifactId>
            <version>4.1.27.Final</version>
        </dependency>
        <dependency>
            <groupId>org.apache.arrow</groupId>
            <artifactId>arrow-vector</artifactId>
            <version>4.0.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.arrow.gandiva</groupId>
            <artifactId>arrow-gandiva</artifactId>
            <version>4.0.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.arrow</groupId>
            <artifactId>arrow-memory-netty</artifactId>
            <version>4.0.0</version>
            <scope>runtime</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
        </plugins>
    </build>

Solution

  • Apache Arrow uses an off-heap storage model very similar to the one Apache Ignite uses. For Arrow this means that objects like IntVector do not keep their data in their on-heap layout. They only keep a reference to a buffer that holds the off-heap address of the physical data. Technically, that is a long value pointing to a chunk of memory within the address space of the JVM process.

    When you restart the JVM, the address space changes, but the record in Apache Ignite native persistence still holds the old pointer. Dereferencing it leads to a SIGSEGV because the address is not mapped in the new process (the crash log reports SEGV_MAPERR for it); the memory it referred to does not exist after a restart.

    You could use Apache Arrow's serialization machinery to store the data permanently in Apache Ignite, or anywhere else. The trade-off is that you lose Arrow's main advantage as a fast in-memory columnar store: it was originally designed to share off-heap data across multiple data-processing systems.
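The difference between persisting a pointer and persisting the data can be sketched with the JDK alone. In the example below, OffHeapInts is a hypothetical stand-in for an Arrow vector (it is not Arrow or Ignite API): the values live in a direct, off-heap ByteBuffer, and toBytes/fromBytes are illustrative helpers that copy the contents into a heap byte[] and back. It is a minimal sketch under those assumptions, not a drop-in fix.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical stand-in for an off-heap vector; not part of Arrow or Ignite. */
public class OffHeapInts {
    // Direct buffer: the ints live outside the Java heap; this object only refers to them.
    private final ByteBuffer data;
    private final int count;

    public OffHeapInts(int... values) {
        count = values.length;
        data = ByteBuffer.allocateDirect(count * Integer.BYTES).order(ByteOrder.LITTLE_ENDIAN);
        for (int v : values) {
            data.putInt(v);
        }
        data.flip(); // position = 0, limit = number of bytes written
    }

    public int get(int index) {
        return data.getInt(index * Integer.BYTES); // absolute read, position unchanged
    }

    public int size() {
        return count;
    }

    /** Copies the off-heap contents into a heap array that is safe to persist. */
    public byte[] toBytes() {
        byte[] copy = new byte[data.remaining()];
        data.duplicate().get(copy); // duplicate() keeps the original position untouched
        return copy;
    }

    /** Allocates fresh off-heap memory in the current process and copies the values back. */
    public static OffHeapInts fromBytes(byte[] bytes) {
        ByteBuffer src = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        int[] values = new int[bytes.length / Integer.BYTES];
        for (int i = 0; i < values.length; i++) {
            values[i] = src.getInt();
        }
        return new OffHeapInts(values);
    }

    public static void main(String[] args) {
        OffHeapInts original = new OffHeapInts(111, 222);
        byte[] persisted = original.toBytes();            // this is what would go into the cache
        OffHeapInts restored = OffHeapInts.fromBytes(persisted);
        System.out.println(restored.get(0) + ", " + restored.get(1)); // prints "111, 222"
    }
}
```

With real Arrow vectors, the analogous step would be Arrow's IPC classes (ArrowStreamWriter and ArrowStreamReader), with the resulting byte[] being the value stored in the Ignite cache instead of the IntVector itself.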

    I believe it should also be technically possible to leverage the Apache Ignite binary storage format. That would require implementing a custom BinarySerializer and registering it for the Apache Arrow vector classes. In the configuration below, customSerializer refers to a bean holding that implementation:

    <property name="binaryConfiguration">
        <bean class="org.apache.ignite.configuration.BinaryConfiguration">
            <property name="typeConfigurations">
                <list>
                    <bean class="org.apache.ignite.binary.BinaryTypeConfiguration">
                        <property name="typeName" value="org.apache.arrow.vector.*"/>
                        <property name="serializer" ref="customSerializer"/>
                    </bean>
                </list>
            </property>
        </bean>
    </property>