I'm trying to cache a large dataset from several tables. My server runs CentOS with 8 GB of RAM and 500 GB of disk space.
I configured my local storage policy to persist, and after hitting an open-file limit I tried to raise the limit to 2,000,000 with the following steps:
vi /etc/sysctl.conf
fs.file-max = 2000000 (2 million)
:wq
sysctl -p
But even with this fix, and after running chmod -x on the work directory, I still get the following error:
SEVERE: Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.StorageException: Failed to initialize partition file: /home/grid-gain-server/gridgain-community-8.7.7/work/db/node00-3273af50-1e97-47fa-a237-29e7dfc2d987/cache-COrderCache/part-56.bin]]
class org.apache.ignite.internal.processors.cache.persistence.StorageException: Failed to initialize partition file: /home/grid-gain-server/gridgain-community-8.7.7/work/db/node00-3273af50-1e97-47fa-a237-29e7dfc2d987/cache-COrderCache/part-56.bin
at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:448)
at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:337)
at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:478)
at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:462)
at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:853)
at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:694)
at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.getOrAllocatePartitionMetas(GridCacheOffheapManager.java:1679)
at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.init0(GridCacheOffheapManager.java:1507)
at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:2137)
at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:429)
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.storeValue(GridCacheMapEntry.java:4261)
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.initialValue(GridCacheMapEntry.java:3407)
at org.apache.ignite.internal.processors.cache.GridCacheEntryEx.initialValue(GridCacheEntryEx.java:771)
at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter.loadEntry(GridDhtCacheAdapter.java:683)
at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter.access$600(GridDhtCacheAdapter.java:103)
at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter$5.apply(GridDhtCacheAdapter.java:633)
at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter$5.apply(GridDhtCacheAdapter.java:629)
at org.apache.ignite.internal.processors.cache.store.GridCacheStoreManagerAdapter$3.apply(GridCacheStoreManagerAdapter.java:535)
at org.apache.ignite.cache.store.jdbc.CacheAbstractJdbcStore$1.call(CacheAbstractJdbcStore.java:469)
at org.apache.ignite.cache.store.jdbc.CacheAbstractJdbcStore$1.call(CacheAbstractJdbcStore.java:433)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.file.FileSystemException: /home/grid-gain-server/gridgain-community-8.7.7/work/db/node00-3273af50-1e97-47fa-a237-29e7dfc2d987/cache-COrderCache/part-56.bin: Too many open files
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newAsynchronousFileChannel(UnixFileSystemProvider.java:196)
at java.nio.channels.AsynchronousFileChannel.open(AsynchronousFileChannel.java:248)
at java.nio.channels.AsynchronousFileChannel.open(AsynchronousFileChannel.java:301)
at org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIO.<init>(AsyncFileIO.java:56)
at org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIOFactory.create(AsyncFileIOFactory.java:43)
at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:420)
... 23 more
Nov 24, 2019 4:54:51 PM java.util.logging.LogManager$RootLogger log
SEVERE: JVM will be halted immediately due to the failure: [failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.StorageException: Failed to initialize partition file: /home/grid-gain-server/gridgain-community-8.7.7/work/db/node00-3273af50-1e97-47fa-a237-29e7dfc2d987/cache-COrderCache/part-56.bin]]
What can I do to fix it?
Adding the following configuration was enough for me to avoid this exception:
vi /etc/security/limits.conf
root soft nofile 10240
root hard nofile 20480
Then, in /etc/sysctl.conf, I appended the inotify max-watches setting:
fs.inotify.max_user_watches=524288
Note that root is the name of my user account.
The values are experimental. I'm not sure whether they are safe, but I haven't seen any notable issues on my VM.
I kept the previous configuration (the fs.file-max setting) in place.
A reboot was needed.
Credit to @Stephen Darlington
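To confirm that the new nofile values are actually in effect after the reboot, one option is to ask the kernel which limits a process started by that user receives. The following is a minimal Python sketch, not part of the original fix. It reports the limits of the Python process itself, which it inherits from the login session, so it should be run as the same user that starts the node. The names `nofile_limits` and `fmt` are made up for illustration.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard


def fmt(value):
    # RLIM_INFINITY means "no limit" rather than a real number
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"soft nofile limit: {fmt(soft)}")
    print(f"hard nofile limit: {fmt(hard)}")
```

If the limits.conf entries above apply to the user, the two values should match the configured soft and hard numbers. If they still show the old defaults, the process was probably started from a session that predates the change.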
Just to explain what's going on here: fs.file-max sets an overall limit for the operating system, while the entries in limits.conf set limits for each user. The only other thing I would add is that if you're running Ignite as a user other than root (recommended), you'd change that user's limits instead.
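The two layers can be inspected side by side. As a rough sketch (Linux only, since it reads /proc, and with `system_file_max` being an invented helper name), the snippet below prints the system-wide ceiling from fs.file-max next to the per-process limit that limits.conf controls:

```python
import resource
from pathlib import Path


def system_file_max(path="/proc/sys/fs/file-max"):
    """Return the kernel-wide open-file ceiling (fs.file-max), or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Not on Linux, or the file could not be read or parsed
        return None


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("system-wide fs.file-max:", system_file_max())
    print("per-process soft/hard nofile:", soft, hard)
```

Raising fs.file-max alone changes only the first line of output and leaves the second untouched, which is consistent with the error in the question persisting until limits.conf was edited.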