Tags: java, glassfish, shutdown, application-server

Glassfish cluster shuts down suddenly


I have the following problem. Our Glassfish 3.1 server is configured with a cluster for some heavy processing. From time to time, quite rarely, the cluster shuts down for no apparent reason. I do not find any exceptions in the log, just the server logging shutdown messages. I am attaching two fragments of the log file, one from the DAS and the other from one of the cluster instances. All the nodes are on the same machine. Any help on this would be most welcome.

Thanks

DAS
11:46:45,322 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found resource [logback.xml] at [file:/E:/Glassfish3/glassfish/domains/domain1/generated/jsp/managementtool-1.12.0.20/loader_1608711619/logback.xml]
11:46:45,416 |-ERROR in ch.qos.logback.classic.joran.action.InsertFromJNDIAction - [SMTPServer] has null or empty value
11:46:45,416 |-INFO in ch.qos.logback.classic.joran.action.InsertFromJNDIAction - Setting context variable [LOG_LOCATION] to [E:\\v1\\logs]
11:46:45,416 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - About to instantiate appender of type [ch.qos.logback.core.rolling.RollingFileAppender]
11:46:45,416 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming appender as [FILE]
11:46:45,478 |-INFO in c.q.l.core.rolling.TimeBasedRollingPolicy - No compression will be used
11:46:45,494 |-INFO in c.q.l.core.rolling.TimeBasedRollingPolicy - Will use the pattern E://v1//logs/managementtool/spm-%d{yyyy-MM-dd}.%i.log for the active file
11:46:45,494 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@468b2c18 - The date pattern is 'yyyy-MM-dd' from file name pattern 'E://v1//logs/managementtool/spm-%d{yyyy-MM-dd}.%i.log'.
11:46:45,494 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@468b2c18 - Roll-over at midnight.
11:46:45,494 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@468b2c18 - Setting initial period to Fri Jun 29 11:46:45 EEST 2012
11:46:45,494 |-INFO in ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property
11:46:45,588 |-INFO in ch.qos.logback.core.rolling.RollingFileAppender[FILE] - Active log file name: E://v1//logs/managementtool/spm-2012-06-29.0.log
11:46:45,588 |-INFO in ch.qos.logback.core.rolling.RollingFileAppender[FILE] - File property is set to [null]
11:46:45,588 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]
11:46:45,603 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming appender as [STDOUT]
11:46:45,603 |-INFO in ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property
11:46:45,603 |-INFO in ch.qos.logback.classic.joran.action.RootLoggerAction - Setting level of ROOT logger to INFO
11:46:45,603 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [STDOUT] to Logger[ROOT]
11:46:45,603 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting level of logger [com.mypackage.central] to DEBUG
11:46:45,603 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting level of logger [com.mypackage.managementtool] to DEBUG
11:46:45,603 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - End of configuration.
11:46:45,603 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@80e1ba6 - Registering current configuration as safe fallback

[#|2012-06-29T11:47:09.479+0300|INFO|glassfish3.1.1|javax.enterprise.system.tools.admin.org.glassfish.deployment.admin|_ThreadID=111;_ThreadName=Thread-2;|managementtool-1.12.0.20 was successfully deployed in 49,564 milliseconds.|#]

[#|2012-06-29T11:49:57.263+0300|INFO|glassfish3.1.1|ShoalLogger|_ThreadID=19;_ThreadName=Thread-2;|GMS1015: Received Group Shutting down message from member: server of group: POD_Processing_Cl01|#]

[#|2012-06-29T11:49:57.263+0300|INFO|glassfish3.1.1|ShoalLogger|_ThreadID=18;_ThreadName=Thread-2;|GMS1092: GMS View Change Received for group: POD_Processing_Cl01 : Members in view for MASTER_CHANGE_EVENT(before change analysis) are :
1: MemberId: POD_Processing_Cl01_ins01, MemberType: CORE, Address: 10.220.10.67:9171:228.9.103.177:16084:POD_Processing_Cl01:POD_Processing_Cl01_ins01
2: MemberId: POD_Processing_Cl01_ins01, MemberType: CORE, Address: 10.220.20.110:9181:228.9.103.177:16084:POD_Processing_Cl01:POD_Processing_Cl01_ins01
3: MemberId: POD_Processing_Cl01_ins01, MemberType: CORE, Address: 10.220.20.195:9134:228.9.103.177:16084:POD_Processing_Cl01:POD_Processing_Cl01_ins01
4: MemberId: POD_Processing_Cl01_ins02, MemberType: CORE, Address: 10.220.10.67:9106:228.9.103.177:16084:POD_Processing_Cl01:POD_Processing_Cl01_ins02
5: MemberId: POD_Processing_Cl01_ins02, MemberType: CORE, Address: 10.220.20.110:9152:228.9.103.177:16084:POD_Processing_Cl01:POD_Processing_Cl01_ins02
6: MemberId: POD_Processing_Cl01_ins02, MemberType: CORE, Address: 10.220.20.195:9090:228.9.103.177:16084:POD_Processing_Cl01:POD_Processing_Cl01_ins02
7: MemberId: server, MemberType: SPECTATOR, Address: 10.220.10.67:9157:228.9.103.177:16084:POD_Processing_Cl01:server
8: MemberId: server, MemberType: SPECTATOR, Address: 10.220.20.110:9154:228.9.103.177:16084:POD_Processing_Cl01:server
9: MemberId: server, MemberType: SPECTATOR, Address: 10.220.20.195:9149:228.9.103.177:16084:POD_Processing_Cl01:server
10: MemberId: server, MemberType: SPECTATOR, Address: 10.220.20.197:9129:228.9.103.177:16084:POD_Processing_Cl01:server

and

Cluster
2012-06-29 11:43:27,396 [com-mypackage-worker-Execution slot for task {runId=4121, environmentId=33}}] DEBUG envId{33} - sesId{} - com.mypackage.worker.taskprocessing.flow.stage.DispatcherNotificationStage[61] - Beginning stage WORKER_DISPATCHER_NOTIFICATION...
2012-06-29 11:43:27,396 [com-mypackage-worker-Execution slot for task {runId=4121, environmentId=33}}] INFO  envId{33} - sesId{} - com.mypackage.worker.communication.jms.JMSCompletionMessageSender[45] - Sending to Dispatcher JMS completion message for task {runId=4121, environmentId=33}, status=RUN_COMPLETED}...
2012-06-29 11:43:27,411 [com-mypackage-worker-Execution slot for task {runId=4121, environmentId=33}}] INFO  envId{33} - sesId{} - com.mypackage.worker.communication.jms.JMSCompletionMessageSender[45] - Sending to Dispatcher JMS completion message for task {runId=4121, environmentId=33}, status=RUN_COMPLETED}...DONE
2012-06-29 11:43:27,411 [com-mypackage-worker-Execution slot for task {runId=4121, environmentId=33}}] DEBUG envId{33} - sesId{} - com.mypackage.worker.taskprocessing.flow.stage.DispatcherNotificationStage[61] - Ending stage WORKER_DISPATCHER_NOTIFICATION...
2012-06-29 11:43:27,411 [com-mypackage-worker-Execution slot for task {runId=4121, environmentId=33}}] INFO  envId{33} - sesId{} - com.mypackage.worker.taskprocessing.slotpool.SlotPool[49] - Releasing execution slot for completed task {runId=4121, environmentId=33}}...
2012-06-29 11:43:27,411 [com-mypackage-worker-Execution slot for task {runId=4121, environmentId=33}}] INFO  envId{33} - sesId{} - com.mypackage.worker.taskprocessing.slotpool.SlotPool[49] - Releasing execution slot for completed task {runId=4121, environmentId=33}}...DONE
2012-06-29 11:43:27,411 [com-mypackage-worker-Execution slot for task {runId=4121, environmentId=33}}] DEBUG envId{33} - sesId{} - com.mypackage.worker.taskprocessing.slotpool.SlotPool[61] - Ending stage WORKER_SLOT_POOL...task {runId=4121, environmentId=33}} 
2012-06-29 11:43:27,411 [com-mypackage-worker-Execution slot for task {runId=4121, environmentId=33}}] DEBUG envId{33} - sesId{} - com.mypackage.worker.taskprocessing.ProcessingSlot[61] - Ending method doProcess()...
2012-06-29 11:43:27,411 [com-mypackage-worker-Execution slot for task {runId=4121, environmentId=33}}] DEBUG envId{33} - sesId{} - com.mypackage.worker.taskprocessing.ProcessingSlot[61] - Ending stage WORKER_SLOT...
2012-06-29 11:44:00,443 [admin-thread-pool-24850(5)] DEBUG envId{} - sesId{} - com.mypackage.worker.service.WorkerInstancePropertiesService[61] - Beginning method destroy()...
2012-06-29 11:44:00,459 [admin-thread-pool-24850(5)] DEBUG envId{} - sesId{} - com.mypackage.worker.service.WorkerInstancePropertiesService[61] - Ending method destroy()...
2012-06-29 11:44:00,459 [admin-thread-pool-24850(5)] DEBUG envId{} - sesId{} - com.mypackage.worker.WorkerImpl[61] - Beginning method destroy()...
2012-06-29 11:44:00,459 [admin-thread-pool-24850(5)] DEBUG envId{} - sesId{} - com.mypackage.worker.WorkerImpl[61] - Ending method destroy()...
2012-06-29 11:44:00,459 [admin-thread-pool-24850(5)] DEBUG envId{} - sesId{} - com.mypackage.worker.taskrepository.TaskPreallocationQueue[61] - Beginning method shutdown()...
2012-06-29 11:44:00,459 [admin-thread-pool-24850(5)] DEBUG envId{} - sesId{} - com.mypackage.worker.taskrepository.TaskPreallocationQueue[61] - Ending method shutdown()...
2012-06-29 11:44:00,459 [admin-thread-pool-24850(5)] INFO  envId{} - sesId{} - org.springframework.context.support.ClassPathXmlApplicationContext[1020] - Closing org.springframework.context.support.ClassPathXmlApplicationContext@aedb61f: startup date [Fri Jun 29 11:31:34 EEST 2012]; root of context hierarchy
2012-06-29 11:44:00,459 [admin-thread-pool-24850(5)] DEBUG envId{} - sesId{} - com.mypackage.appframework.persistence.PersistenceServiceImpl[74] - Destroying datasource for environment id : 33
2012-06-29 11:44:00,599 [admin-thread-pool-24850(5)] INFO  envId{} - sesId{} - com.mypackage.appframework.persistence.PersistenceServiceImpl[76] - Destroyed data source for environment id : 33
2012-06-29 11:44:00,599 [admin-thread-pool-24850(5)] INFO  envId{} - sesId{} - org.hibernate.impl.SessionFactoryImpl[925] - closing
2012-06-29 11:44:00,615 [admin-thread-pool-24850(5)] INFO  envId{} - sesId{} - org.springframework.jmx.export.MBeanExporter[429] - Unregistering JMX-exposed beans on shutdown
2012-06-29 11:44:00,646 [Duration logger] INFO  envId{} - sesId{} - com.mypackage.appframework.util.Consumer[75] - Duration logger thread is shutting down
2012-06-29 11:44:00,646 [Alert reporter] INFO  envId{} - sesId{} - com.mypackage.appframework.util.Consumer[75] - Alert reporter thread is shutting down
2012-06-29 11:44:00,709 [admin-thread-pool-24850(5)] INFO  envId{} - sesId{} - com.mypackage.appframework.cache.metadata.MetadataRegistrationService[78] - Successfully unregistered for environment with id 33 and name Env1
2012-06-29 11:44:00,724 [admin-thread-pool-24850(5)] INFO  envId{} - sesId{} - org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean[441] - Closing JPA EntityManagerFactory for persistence unit 'central'
2012-06-29 11:44:00,740 [admin-thread-pool-24850(5)] INFO  envId{} - sesId{} - org.hibernate.impl.SessionFactoryImpl[925] - closing
2012-06-29 11:44:00,740 [admin-thread-pool-24850(5)] DEBUG envId{} - sesId{} - com.mypackage.central.persistence.CentralDataSource[57] - Destroying central data source.
2012-06-29 11:44:00,818 [admin-thread-pool-24850(5)] DEBUG envId{} - sesId{} - com.mypackage.webframework.config.LogbackContextListener[77] - Stopping the loggerContext for worker on the context destroy event

Solution

  • It seems that there are multiple clusters on different machines (different IP addresses) that share the same GMS multicast address and port. The GMS view below shows the same multicast group 228.9.103.177:16084 being joined from four different host IPs, with the same member IDs appearing more than once. Because all of these clusters end up in one GMS group, when one cluster receives a shutdown message the others receive it too and act on it (see the sketch after the member list below for a way to separate them).

    1: MemberId: POD_Processing_Cl01_ins01, MemberType: CORE, Address: 10.220.10.67:9171:228.9.103.177:16084:POD_Processing_Cl01:POD_Processing_Cl01_ins01
    2: MemberId: POD_Processing_Cl01_ins01, MemberType: CORE, Address: 10.220.20.110:9181:228.9.103.177:16084:POD_Processing_Cl01:POD_Processing_Cl01_ins01
    3: MemberId: POD_Processing_Cl01_ins01, MemberType: CORE, Address: 10.220.20.195:9134:228.9.103.177:16084:POD_Processing_Cl01:POD_Processing_Cl01_ins01
    4: MemberId: POD_Processing_Cl01_ins02, MemberType: CORE, Address: 10.220.10.67:9106:228.9.103.177:16084:POD_Processing_Cl01:POD_Processing_Cl01_ins02
    5: MemberId: POD_Processing_Cl01_ins02, MemberType: CORE, Address: 10.220.20.110:9152:228.9.103.177:16084:POD_Processing_Cl01:POD_Processing_Cl01_ins02
    6: MemberId: POD_Processing_Cl01_ins02, MemberType: CORE, Address: 10.220.20.195:9090:228.9.103.177:16084:POD_Processing_Cl01:POD_Processing_Cl01_ins02
    7: MemberId: server, MemberType: SPECTATOR, Address: 10.220.10.67:9157:228.9.103.177:16084:POD_Processing_Cl01:server
    8: MemberId: server, MemberType: SPECTATOR, Address: 10.220.20.110:9154:228.9.103.177:16084:POD_Processing_Cl01:server
    9: MemberId: server, MemberType: SPECTATOR, Address: 10.220.20.195:9149:228.9.103.177:16084:POD_Processing_Cl01:server
    10: MemberId: server, MemberType: SPECTATOR, Address: 10.220.20.197:9129:228.9.103.177:16084:POD_Processing_Cl01:server
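
    One way to separate the clusters is to give each one its own GMS multicast address and port. Below is a minimal sketch using asadmin, assuming the cluster name taken from the logs and example address/port values; the dotted attribute names (gms-multicast-address, gms-multicast-port) should be verified against your GlassFish 3.1 domain.xml before running anything:

        # Check which multicast address/port the cluster currently uses
        # (should match the 228.9.103.177:16084 visible in the GMS view above).
        asadmin get clusters.cluster.POD_Processing_Cl01.gms-multicast-address
        asadmin get clusters.cluster.POD_Processing_Cl01.gms-multicast-port

        # Move this cluster to its own multicast group (example values only),
        # so it no longer shares a GMS group with clusters on other machines.
        asadmin set clusters.cluster.POD_Processing_Cl01.gms-multicast-address=228.9.103.200
        asadmin set clusters.cluster.POD_Processing_Cl01.gms-multicast-port=16090

        # Verify from every node that the new multicast group is reachable
        # before restarting (GlassFish 3.1 ships the validate-multicast tool).
        asadmin validate-multicast --multicastaddress 228.9.103.200 --multicastport 16090

    GMS reads these settings at startup, so the DAS and the cluster instances will likely need to be restarted for the change to take effect.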