Access Gateway does not start after upgrading to 4.4.4

  • 7024416
  • 08-Feb-2020
  • 13-Mar-2020

Environment

  • Access Manager Version 4.4.4
  • Access Gateway Appliance cluster

Situation

  • All Access Manager devices have been upgraded from version 4.4.1 to 4.4.4
  • After the upgrade, only one Access Gateway Appliance in the cluster fails to start, returning: "org.apache.catalina.LifecycleException.error"

  • The catalina.out log reports the following entries:
SEVERE: ContainerBase.addChild: start:
org.apache.catalina.LifecycleException: Failed to start component [StandardEngine[Catalina].StandardHost[localhost].StandardContext[/nesp]]
        at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:167)
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:754)
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:730)
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:734)
        at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:1140)
        at org.apache.catalina.startup.HostConfig$DeployDirectory.run(HostConfig.java:1875)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Unable to complete the scan for annotations for web application [/nesp] due to a StackOverflowError. Possible root causes include a too low setting for -Xss and illegal cyclic inheritance dependencies. The class hierarchy being processed was [org.bouncycastle.asn1.ASN1EncodableVector->org.bouncycastle.asn1.DEREncodableVector->org.bouncycastle.asn1.ASN1EncodableVector]
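The class chain at the end of the "Caused by" line names the package involved in the cycle. As a minimal sketch, assuming the log text has already been read into a string, the chain could be pulled out like this. The function names are hypothetical and not part of Access Manager or Tomcat; the pattern only matches the message wording shown above.

```python
import re

# Matches the bracketed class chain in the annotation-scan error shown above.
HIERARCHY_RE = re.compile(r"The class hierarchy being processed was \[([^\]]+)\]")


def extract_class_cycles(log_text):
    """Return every class chain found in the log, each as a list of class names."""
    return [
        [name.strip() for name in match.group(1).split("->")]
        for match in HIERARCHY_RE.finditer(log_text)
    ]


def packages_in_chain(chain):
    """Return the sorted, de-duplicated package names used in one class chain."""
    return sorted({name.rsplit(".", 1)[0] for name in chain})
```

Applied to the entry above, this yields a single chain whose classes all belong to org.bouncycastle.asn1, which points at the Bouncy Castle jar files in the /nesp web application.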

Resolution

  • This kind of error often indicates a Java version conflict or duplicate application files.
  • Comparing the Embedded Service Provider (ESP) application folder "webapp/WEB-INF/lib/" between the working and non-working Access Gateway Appliances revealed no differences.
  • However, a comparison against a "clean" lab server showed that the folder contained an extra file, a second Bouncy Castle Crypto package (the jdk14 file below):
/opt/novell/nesp/lib/webapp/WEB-INF/lib/bcprov-jdk14-119.jar
/opt/novell/nesp/lib/webapp/WEB-INF/lib/bcprov-jdk15on-157.jar

  • A fresh installation of the NAM 4.4.4 Access Gateway Appliance does not include a JDK14 version of the Bouncy Castle Crypto package.
  • After bcprov-jdk14-119.jar was removed, the appliance started without any further problems.
  • The jdk14 file was also removed from the working Access Gateways as a precaution.
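The folder comparison described above can be sketched in a few lines. This is only an illustration, assuming the lib folder of the affected appliance and a copy of the lib folder from a clean installation are both readable from the machine running the script. The helper names are hypothetical.

```python
from pathlib import Path


def list_jars(lib_dir):
    """Return the set of .jar file names directly inside lib_dir."""
    return {path.name for path in Path(lib_dir).glob("*.jar")}


def extra_jars(suspect, reference):
    """Return jar names present in `suspect` but missing from `reference`, sorted."""
    return sorted(set(suspect) - set(reference))
```

For example, `extra_jars(list_jars("/opt/novell/nesp/lib/webapp/WEB-INF/lib"), list_jars(clean_copy))` would list leftover files, where `clean_copy` is a placeholder for wherever the reference folder was copied to. Comparing two upgraded nodes with each other finds nothing when both carry the same leftover, which is why the reference has to be a fresh installation.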

Cause

The cause was an older version of the Bouncy Castle Crypto package, compiled for JDK14, being loaded on one node only.
The jdk15 package has been used since the release of NAM 4.0, so this library must be a leftover from NAM 3.2.x.

The load order on just this one node was different for an unknown reason. All nodes were configured and upgraded identically, but some environmental issue must have caused the jar files to load in a different order on this server.
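Because the outcome depends on which jar is loaded first, it can help to check whether any class is shipped by more than one jar in the same lib folder. The following is a rough sketch rather than a supported tool: it treats each jar as a zip archive and reports class entries that occur in more than one of them. The function names are hypothetical.

```python
import zipfile
from collections import defaultdict
from pathlib import Path


def read_jar_entries(lib_dir):
    """Map each jar file name in lib_dir to the set of entry names it contains."""
    entries = {}
    for jar in Path(lib_dir).glob("*.jar"):
        with zipfile.ZipFile(jar) as archive:
            entries[jar.name] = set(archive.namelist())
    return entries


def duplicate_classes(jar_entries):
    """Return {class entry: [jar names]} for classes found in more than one jar."""
    owners = defaultdict(list)
    for jar_name in sorted(jar_entries):
        for entry in set(jar_entries[jar_name]):
            if entry.endswith(".class"):
                owners[entry].append(jar_name)
    return {cls: jars for cls, jars in owners.items() if len(jars) > 1}
```

Running `duplicate_classes(read_jar_entries("/opt/novell/nesp/lib/webapp/WEB-INF/lib"))` on an affected node would be expected to report org/bouncycastle entries owned by both bcprov jar files, although the exact overlap between the two versions is an assumption and was not verified here.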