Apache Ignite: Reached logical end of the segment for file


I have enabled Ignite native persistence and disabled the write-ahead log (WAL):

<property name="dataStorageConfiguration">
    <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
        <property name="defaultDataRegionConfiguration">
            <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                <property name="persistenceEnabled" value="true"/>
            </bean>
        </property>
        <!-- disabled wal log because query result doesn't need recovery -->
        <property name="walMode" value="NONE"/>
    </bean>
</property>
<property name="cacheConfiguration">
    <bean class="org.apache.ignite.configuration.CacheConfiguration">
        <!-- Set the cache name. -->
        <property name="name" value="query_cache"/>
        <!-- Set the cache mode. -->
        <property name="cacheMode" value="PARTITIONED"/>
    </bean>
</property>

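I start the server from one application class and work with the cache from another:

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.cluster.ClusterState;

public class IgniteServer {
    public static void main(String[] args) {
        Ignite ignite = Ignition.start("examples/config/ignite-server-config.xml");
        // With persistence enabled, the cluster must be activated explicitly.
        ignite.cluster().state(ClusterState.ACTIVE);
    }
}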


try (Ignite ignite = Ignition.start("examples/config/ignite-server-config.xml")) {
    IgniteCache<String, String> cache = ignite.getOrCreateCache("query_cache");
    cache.put("1", "value-1");
    System.out.println(cache.get("1"));
}

This works fine, but after stopping IgniteServer I can't restart it. It fails with the following error:

[15:28:20] Initialized write-ahead log manager in NONE mode, persisted data may be lost in a case of unexpected node failure. Make sure to deactivate the cluster before shutdown.
[2021-03-18 15:28:20,876][ERROR][main][root] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.StorageException: Failed to read checkpoint record from WAL, persistence consistency cannot be guaranteed. Make sure configuration points to correct WAL folders and WAL folder is properly mounted [ptr=FileWALPointer [idx=0, fileOff=0, len=0], walPath=db/wal, walArchive=db/wal/archive]]]
class org.apache.ignite.internal.processors.cache.persistence.StorageException: Failed to read checkpoint record from WAL, persistence consistency cannot be guaranteed. Make sure configuration points to correct WAL folders and WAL folder is properly mounted [ptr=FileWALPointer [idx=0, fileOff=0, len=0], walPath=db/wal, walArchive=db/wal/archive]
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.performBinaryMemoryRestore(GridCacheDatabaseSharedManager.java:2269)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readMetastore(GridCacheDatabaseSharedManager.java:873)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.notifyMetaStorageSubscribersOnReadyForRead(GridCacheDatabaseSharedManager.java:5022)
    at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1251)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2052)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1698)
    at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1114)
    at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1032)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:918)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:817)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:687)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:656)
    at org.apache.ignite.Ignition.start(Ignition.java:353)
    at org.apache.ignite.examples.atest.IgniteServer.main(IgniteServer.java:9)
[2021-03-18 15:28:20,881][ERROR][main][root] JVM will be halted immediately due to the failure: [failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.StorageException: Failed to read checkpoint record from WAL, persistence consistency cannot be guaranteed. Make sure configuration points to correct WAL folders and WAL folder is properly mounted [ptr=FileWALPointer [idx=0, fileOff=0, len=0], walPath=db/wal, walArchive=db/wal/archive]]]

Sometimes the server shuts down on its own with this message:

FileWriteAheadLogManager: Reached logical end of the segment for file: /ignite/work/db/wal/node-xxxx/xxx.wal

I have disabled the WAL, so I don't understand why Ignite still tries to read a checkpoint record and fails. In $IGNITE_HOME/work/db/wal/node-xxx/ I found 10 WAL files, each 67.1MB, which looks as if an infinite loop filled them all. After deleting the work folder I can start the server again.

Questions:

  1. How can I fix this problem while keeping native persistence on and the WAL off?
  2. It seems I am shutting the server down the wrong way. How can I stop it safely from code so that the next start does not fail on the checkpoint read?

Thanks.


Solution

    1. I advise against walMode=NONE. If you have to use it, remove the whole persistence directory before restarting a node, or
    2. Call ignite.cluster().state(ClusterState.INACTIVE) to deactivate the cluster before shutting down any node, as the startup log message itself recommends.
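
For the first point, a minimal sketch of the question's storage configuration with the WAL left on might look like the fragment below. The choice of LOG_ONLY is an assumption, since the answer does not name a replacement mode; it is Ignite's default WAL mode, so removing the walMode property entirely should have the same effect.

```xml
<property name="dataStorageConfiguration">
    <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
        <property name="defaultDataRegionConfiguration">
            <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                <property name="persistenceEnabled" value="true"/>
            </bean>
        </property>
        <!-- Assumption: LOG_ONLY (Ignite's default) instead of NONE, so that
             a checkpoint record is available to read on restart. -->
        <property name="walMode" value="LOG_ONLY"/>
    </bean>
</property>
```

The other WALMode values are FSYNC and BACKGROUND. Which one fits depends on how much durability the cached query results actually need.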