While WildFly is starting, if a single step takes more than 10 minutes to complete, the start process will roll back and report the following error:
ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) WFLYCTL0348: Timeout after [600] seconds waiting for service container stability. Operation will roll back. Step that first updated the service container was 'add' at address '[
("core-service" => "management"),
("management-interface" => "http-interface")
]'
This error can have several causes, most commonly slow server performance. This article covers a short-term workaround that allows more time before the timeout is hit. Note that because the error stems from server slowness, increasing the timeout reduces the impact of the slowness but does not resolve the slowness itself.
Temporary workaround - increasing timeout
To increase the timeout, navigate to the following file:
<infogix install dir>/wildfly/IV/configuration/standalone-full-ha.xml
Locate the following record:
<system-properties>
<property name="jboss.as.management.blocking.timeout" value="600"/>
</system-properties>
Increase the value attribute to 1200 (20 minutes) or 2400 (40 minutes); the value is expressed in seconds. This allows more time for the start to finish.
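As an illustration, the record might look like this after raising the timeout to 20 minutes. This is a sketch only: it assumes the block contains no other properties, so if your file has additional entries inside system-properties, leave them in place and change only the value attribute shown here.

```xml
<system-properties>
    <!-- Timeout in seconds: 1200 = 20 minutes (raised from the original 600) -->
    <property name="jboss.as.management.blocking.timeout" value="1200"/>
</system-properties>
```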
The change above will be lost during the next deploy, so the same timeout must also be set in the Infogix appserver.advanced.properties property file. However, standalone-full-ha.xml must be updated manually first, because the JVM needs to start successfully before the change below can be deployed:
WILDFLY_MGMT_TIMEOUT=1200
Diagnosing the root cause
The container stability error occurs because of slowness during startup. The WildFly server log will contain more information or clues. A common cause in Infogix deployments with multiple JVMs is a JGroups cluster cache issue.