First of all, i have two environments strictly identically configured (exept IP) with two vm each. One in pre-production and one in production (currently in configuration phase). There is one vm with a liferay 7.0.6 tomcat bundle (build from 7.0.6-cumulative patch of the community-security-team) and an other with postgresql 9.4.26.
Everything works fine on pre-production environment.
On production environment, a few hours after beginning to create users in liferay i ran into this error (full stack at the end) :
Caused by: java.sql.SQLTransientConnectionException: HikariPool-2 - Connection is not available, request timed out after 937980ms.
at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:591)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:194)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:146)
at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:112)
at org.springframework.jdbc.datasource.LazyConnectionDataSourceProxy$LazyConnectionInvocationHandler.getTargetConnection(LazyConnectionDataSourceProxy.java:403)
at org.springframework.jdbc.datasource.LazyConnectionDataSourceProxy$LazyConnectionInvocationHandler.invoke(LazyConnectionDataSourceProxy.java:376)
at com.sun.proxy.$Proxy7.prepareStatement(Unknown Source)
at org.hibernate.jdbc.AbstractBatcher.getPreparedStatement(AbstractBatcher.java:534)
at org.hibernate.jdbc.AbstractBatcher.getPreparedStatement(AbstractBatcher.java:452)
at org.hibernate.jdbc.AbstractBatcher.prepareQueryStatement(AbstractBatcher.java:161)
at org.hibernate.loader.Loader.prepareQueryStatement(Loader.java:1700)
at org.hibernate.loader.Loader.doQuery(Loader.java:801)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:274)
at org.hibernate.loader.Loader.loadEntity(Loader.java:2037)
... 50 more
So i checked if there is differences between my liferay configuration on pre-production and the one in production using comparison software, and exept IP i found nothing. Idem with postgresql configurations on both environment.
I also checked time synchronization between vm and they are both synchronize via ntp on debian pool server.
Database vm :
n$ ntpq -p
remote refid st t when poll reach delay offset jitter
==============================================================================
0.debian.pool.n .POOL. 16 p - 64 0 0.000 0.000 0.002
1.debian.pool.n .POOL. 16 p - 64 0 0.000 0.000 0.002
2.debian.pool.n .POOL. 16 p - 64 0 0.000 0.000 0.002
3.debian.pool.n .POOL. 16 p - 64 0 0.000 0.000 0.002
+mail.klausen.dk 193.67.79.202 2 u 116 1024 377 14.435 -0.214 0.358
+any.time.nl 85.199.214.99 2 u 871 1024 377 1.666 -0.183 0.258
-rag.9t4.net 131.188.3.221 2 u 102 1024 377 16.491 -3.769 0.571
*ntp1.m-online.n 212.18.1.106 2 u 318 1024 377 16.608 -0.263 0.240
-tor-relais1.lin 131.188.3.223 2 u 199 1024 377 14.149 0.272 0.661
-www.kashra.com .DCFp. 1 u 150 1024 377 22.623 1.126 0.816
and Liferay vm :
$ ntpq -p
remote refid st t when poll reach delay offset jitter
==============================================================================
0.debian.pool.n .POOL. 16 p - 64 0 0.000 0.000 0.002
1.debian.pool.n .POOL. 16 p - 64 0 0.000 0.000 0.002
2.debian.pool.n .POOL. 16 p - 64 0 0.000 0.000 0.002
3.debian.pool.n .POOL. 16 p - 64 0 0.000 0.000 0.002
+stratum2-4.ntp. 129.70.137.82 2 u 200 1024 377 30.471 1.069 2.414
+138.201.16.225 131.188.3.221 2 u 696 1024 377 16.613 1.357 0.397
-kuehlich.com 131.188.3.221 2 u 373 1024 377 22.656 2.025 0.885
-time.cloudflare 10.21.8.19 3 u 566 1024 377 8.123 -0.640 0.277
*a.chl.la 131.188.3.222 2 u 167 1024 377 14.472 1.033 2.448
+195.50.171.101 145.253.3.52 2 u 266 1024 377 10.804 0.928 0.395
I also notice an error line in postgresql log about a request of process cancelation on unknown PID happening exactly the amout (937980ms) of milliseconds before the timeout error in Liferay:
LOG: PID 1767 in the cancel request does not match any process
I have tried re-installing Liferay from scratch but nothing change. It should exist a difference between pre-production and production because it works fine on pre-production but i can't find it.
HikariCP configuration in liferay is default on both environment
jdbc.default.connectionTimeout=30000
jdbc.default.driverClassName=org.postgresql.Driver
jdbc.default.idleConnectionTestPeriod=60
jdbc.default.idleTimeout=600000
jdbc.default.initialPoolSize=10
jdbc.default.liferay.pool.provider=hikaricp
jdbc.default.maxActive=100
jdbc.default.maxIdleTime=3600
jdbc.default.maxLifetime=0
jdbc.default.maxPoolSize=100
jdbc.default.maximumPoolSize=100
jdbc.default.minIdle=10
jdbc.default.minPoolSize=10
jdbc.default.minimumIdle=10
, and same for postgresql :
max_connections = 100
HikariPool full satck from liferay :
2021-06-30 00:12:32.397 ERROR [liferay/scheduler_dispatch-6][JDBCExceptionReporter:234] HikariPool-2 - Connection is not available, request timed out after 937980ms.
2021-06-30 00:12:32.401 ERROR [liferay/scheduler_dispatch-6][BasePersistenceImpl:264] Caught unexpected exception
com.liferay.portal.kernel.exception.SystemException: com.liferay.portal.kernel.dao.orm.ORMException: org.hibernate.exception.GenericJDBCException: could not load an entity: [com.liferay.counter.model.impl.CounterImpl#com.liferay.counter.kernel.model.Counter]
at com.liferay.portal.kernel.service.persistence.impl.BasePersistenceImpl.processException(BasePersistenceImpl.java:270)
at com.liferay.counter.service.persistence.impl.CounterFinderImpl._obtainIncrement(CounterFinderImpl.java:391)
at com.liferay.counter.service.persistence.impl.CounterFinderImpl._competeIncrement(CounterFinderImpl.java:339)
at com.liferay.counter.service.persistence.impl.CounterFinderImpl._competeIncrement(CounterFinderImpl.java:325)
at com.liferay.counter.service.persistence.impl.CounterFinderImpl.increment(CounterFinderImpl.java:111)
at com.liferay.counter.service.persistence.impl.CounterFinderImpl.increment(CounterFinderImpl.java:100)
at com.liferay.counter.service.persistence.impl.CounterFinderImpl.increment(CounterFinderImpl.java:95)
at com.liferay.counter.service.impl.CounterLocalServiceImpl.increment(CounterLocalServiceImpl.java:42)
at sun.reflect.GeneratedMethodAccessor638.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.liferay.portal.spring.aop.ServiceBeanMethodInvocation.proceed(ServiceBeanMethodInvocation.java:163)
at com.liferay.portal.spring.transaction.CounterTransactionExecutor.execute(CounterTransactionExecutor.java:50)
at com.liferay.portal.spring.transaction.TransactionInterceptor.invoke(TransactionInterceptor.java:58)
at com.liferay.portal.spring.aop.ServiceBeanMethodInvocation.proceed(ServiceBeanMethodInvocation.java:137)
at com.liferay.portal.spring.aop.ServiceBeanAopProxy.invoke(ServiceBeanAopProxy.java:169)
at com.sun.proxy.$Proxy78.increment(Unknown Source)
at com.liferay.counter.kernel.service.CounterLocalServiceUtil.increment(CounterLocalServiceUtil.java:238)
at com.liferay.portal.kernel.systemevent.SystemEventHierarchyEntryThreadLocal.push(SystemEventHierarchyEntryThreadLocal.java:134)
at com.liferay.portal.kernel.systemevent.SystemEventHierarchyEntryThreadLocal.push(SystemEventHierarchyEntryThreadLocal.java:96)
at com.liferay.portal.repository.capabilities.TemporaryFileEntriesCapabilityImpl._runWithoutSystemEvents(TemporaryFileEntriesCapabilityImpl.java:313)
at com.liferay.portal.repository.capabilities.TemporaryFileEntriesCapabilityImpl.deleteExpiredTemporaryFileEntries(TemporaryFileEntriesCapabilityImpl.java:113)
at com.liferay.document.library.web.internal.messaging.TempFileEntriesMessageListener.deleteExpiredTemporaryFileEntries(TempFileEntriesMessageListener.java:111)
at com.liferay.document.library.web.internal.messaging.TempFileEntriesMessageListener$1.performAction(TempFileEntriesMessageListener.java:134)
at com.liferay.document.library.web.internal.messaging.TempFileEntriesMessageListener$1.performAction(TempFileEntriesMessageListener.java:130)
at com.liferay.portal.kernel.dao.orm.DefaultActionableDynamicQuery.performAction(DefaultActionableDynamicQuery.java:405)
at com.liferay.portal.kernel.dao.orm.DefaultActionableDynamicQuery$1.call(DefaultActionableDynamicQuery.java:315)
at com.liferay.portal.kernel.dao.orm.DefaultActionableDynamicQuery$1.call(DefaultActionableDynamicQuery.java:277)
at com.liferay.portal.kernel.dao.orm.DefaultActionableDynamicQuery.doPerformActions(DefaultActionableDynamicQuery.java:335)
at com.liferay.portal.kernel.dao.orm.DefaultActionableDynamicQuery.performActions(DefaultActionableDynamicQuery.java:86)
at com.liferay.document.library.web.internal.messaging.TempFileEntriesMessageListener.doReceive(TempFileEntriesMessageListener.java:139)
at com.liferay.portal.kernel.messaging.BaseMessageListener.receive(BaseMessageListener.java:26)
at com.liferay.portal.kernel.scheduler.messaging.SchedulerEventMessageListenerWrapper.receive(SchedulerEventMessageListenerWrapper.java:66)
at com.liferay.portal.kernel.messaging.InvokerMessageListener.receive(InvokerMessageListener.java:74)
at com.liferay.portal.kernel.messaging.ParallelDestination$1.run(ParallelDestination.java:52)
at com.liferay.portal.kernel.concurrent.ThreadPoolExecutor$WorkerTask._runTask(ThreadPoolExecutor.java:756)
at com.liferay.portal.kernel.concurrent.ThreadPoolExecutor$WorkerTask.run(ThreadPoolExecutor.java:667)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.liferay.portal.kernel.dao.orm.ORMException: org.hibernate.exception.GenericJDBCException: could not load an entity: [com.liferay.counter.model.impl.CounterImpl#com.liferay.counter.kernel.model.Counter]
at com.liferay.portal.dao.orm.hibernate.ExceptionTranslator.translate(ExceptionTranslator.java:34)
at com.liferay.portal.dao.orm.hibernate.SessionImpl.get(SessionImpl.java:205)
at com.liferay.portal.kernel.dao.orm.ClassLoaderSession.get(ClassLoaderSession.java:326)
at com.liferay.counter.service.persistence.impl.CounterFinderImpl._obtainIncrement(CounterFinderImpl.java:369)
... 36 more
Caused by: org.hibernate.exception.GenericJDBCException: could not load an entity: [com.liferay.counter.model.impl.CounterImpl#com.liferay.counter.kernel.model.Counter]
at org.hibernate.exception.SQLStateConverter.handledNonSpecificException(SQLStateConverter.java:140)
at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:128)
at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:66)
at org.hibernate.loader.Loader.loadEntity(Loader.java:2041)
at org.hibernate.loader.entity.AbstractEntityLoader.load(AbstractEntityLoader.java:86)
at org.hibernate.loader.entity.AbstractEntityLoader.load(AbstractEntityLoader.java:76)
at org.hibernate.persister.entity.AbstractEntityPersister.load(AbstractEntityPersister.java:3294)
at org.hibernate.event.def.DefaultLoadEventListener.loadFromDatasource(DefaultLoadEventListener.java:496)
at org.hibernate.event.def.DefaultLoadEventListener.doLoad(DefaultLoadEventListener.java:477)
at org.hibernate.event.def.DefaultLoadEventListener.load(DefaultLoadEventListener.java:227)
at org.hibernate.event.def.DefaultLoadEventListener.lockAndLoad(DefaultLoadEventListener.java:403)
at org.hibernate.event.def.DefaultLoadEventListener.onLoad(DefaultLoadEventListener.java:155)
at org.hibernate.impl.SessionImpl.fireLoad(SessionImpl.java:1090)
at org.hibernate.impl.SessionImpl.get(SessionImpl.java:1075)
at org.hibernate.impl.SessionImpl.get(SessionImpl.java:1066)
at com.liferay.portal.dao.orm.hibernate.SessionImpl.get(SessionImpl.java:201)
... 38 more
Caused by: java.sql.SQLTransientConnectionException: HikariPool-2 - Connection is not available, request timed out after 937980ms.
at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:591)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:194)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:146)
at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:112)
at org.springframework.jdbc.datasource.LazyConnectionDataSourceProxy$LazyConnectionInvocationHandler.getTargetConnection(LazyConnectionDataSourceProxy.java:403)
at org.springframework.jdbc.datasource.LazyConnectionDataSourceProxy$LazyConnectionInvocationHandler.invoke(LazyConnectionDataSourceProxy.java:376)
at com.sun.proxy.$Proxy7.prepareStatement(Unknown Source)
at org.hibernate.jdbc.AbstractBatcher.getPreparedStatement(AbstractBatcher.java:534)
at org.hibernate.jdbc.AbstractBatcher.getPreparedStatement(AbstractBatcher.java:452)
at org.hibernate.jdbc.AbstractBatcher.prepareQueryStatement(AbstractBatcher.java:161)
at org.hibernate.loader.Loader.prepareQueryStatement(Loader.java:1700)
at org.hibernate.loader.Loader.doQuery(Loader.java:801)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:274)
at org.hibernate.loader.Loader.loadEntity(Loader.java:2037)
... 50 more
I'm actually out of other idea to test, any help would be thankfull.
Finally we found the solution.
On pre-production environment the two vms are on the same VLAN, which was not the case on production.
Solution: putting the vms on the same VLAN solve the problem.