1.2 Troubleshooting Orchestration Server Issues

The following sections provide solutions to problems you might encounter while using the Orchestration Server:

Orchestration Server Might Appear to Be Deadlocked When Provisioning Large Numbers of Jobs with Subjobs

Source: Cloud Manager Orchestration Server
Explanation: In some deployments where a large number of running jobs spawn subjobs, the running jobs might appear to stop, leaving jobs in the queue.
Possible Cause: This occurs because of job limits set in the Orchestration Server to avoid overload or “runaway” conditions.
Action: If this apparent deadlock occurs, adjust the job limits gradually to tune them for your deployment. For more information, see Job Limits Panel in the NetIQ Cloud Manager 2.1.5 Orchestration Console Reference.
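The following minimal sketch illustrates why a job limit can stall jobs that spawn subjobs. It is a simplified model, not the Orchestration Server scheduler: the function name simulate and its parameters are hypothetical, and it assumes that each running job spawns exactly one subjob and that subjobs draw from the same slot pool as their parents.

```python
def simulate(job_limit, parent_jobs):
    """Model one scheduling round under a shared job limit (illustrative only).

    Assumptions: every running parent holds its slot until its single
    subjob finishes, and subjobs need a slot from the same pool.
    Returns (running_subjobs, queued_subjobs, stalled).
    """
    running_parents = min(parent_jobs, job_limit)
    free_slots = job_limit - running_parents
    # Each running parent spawns one subjob; only those that get a slot run.
    running_subjobs = min(running_parents, free_slots)
    queued_subjobs = running_parents - running_subjobs
    # Stalled: parents wait on subjobs, but no subjob can ever start.
    stalled = running_subjobs == 0 and queued_subjobs > 0
    return running_subjobs, queued_subjobs, stalled


if __name__ == "__main__":
    for limit in (4, 6, 8):
        print(limit, simulate(job_limit=limit, parent_jobs=4))
```

In this model, a limit equal to the number of parent jobs leaves every subjob queued, while raising the limit step by step lets subjobs run again, which matches the gradual tuning recommended above.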

Orchestration Server Might Hang if the System Clock Is Changed Abruptly

Source: Cloud Manager Orchestration Server
Explanation: As with many applications, you should avoid abrupt changes in the system clock on the machine where the Orchestration Server is installed; otherwise, the agent might appear to hang, waiting for the clock to catch up.

Clock changes that result from daylight saving time adjustments do not cause this issue.

Action: We recommend that you use proper clock synchronization tools such as a Network Time Protocol (NTP) server in your network to avoid large stepping of the system clock.

Authentication to an Active Directory Server Might Fail

Source: Cloud Manager Orchestration Server
Explanation: A simplified Active Directory Server (ADS) setup might be insufficient because of a customized ADS install (for example, namingContexts entries that generate referrals when they are looked up).
Possible Cause: The checking logic in the current AuthLDAP auth provider assumes that if any namingContext entry is returned, it has found the domain and it stops searching.
Action: If you encounter this issue, you need to manually configure LDAP as a generic LDAP server, which offers many more configuration options.

The Orchestration Server Must Have Sufficient RAM

Source: Cloud Manager Orchestration Server
Explanation: If the Orchestration Server fails to start after installation and configuration, sufficient RAM might not be installed on your hardware or assigned to the VM you are attempting to use.
Possible Cause: The Orchestration Server requires 3 GB of RAM to function with the preset defaults.
Action: If the server does not start, increase your physical RAM size (or, for a VM, increase the setting for virtual RAM size). Alternatively, you can reduce the JVM heap size, as explained in Validating and Optimizing the Orchestration Configuration in the NetIQ Cloud Manager 2.1.5 Orchestration Installation Guide.
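As a quick pre-check before changing hardware or VM settings, a short script can compare the machine's physical RAM against the 3 GB figure stated above. This is a sketch that assumes a Linux or UNIX host where os.sysconf exposes SC_PAGE_SIZE and SC_PHYS_PAGES; the function names are hypothetical and are not part of any Cloud Manager tool.

```python
import os

# 3 GB, the requirement stated above for the preset defaults.
REQUIRED_BYTES = 3 * 1024 ** 3


def physical_ram_bytes():
    """Total physical RAM in bytes, as reported by the operating system."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def has_sufficient_ram(total_bytes, required_bytes=REQUIRED_BYTES):
    """Return True if total_bytes meets or exceeds the required amount."""
    return total_bytes >= required_bytes


if __name__ == "__main__":
    total = physical_ram_bytes()
    print("Physical RAM: %.2f GB" % (total / 1024.0 ** 3))
    if not has_sufficient_ram(total):
        print("Less than 3 GB reported: add RAM or reduce the JVM heap size.")
```

The operating system usually reports slightly less than the installed amount because some memory is reserved, so treat a result just under 3 GB as a borderline case rather than a definite failure.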

Calling terminate() from within a Job Class Allows the JDL Thread Execution to Continue

Source: Cloud Manager Orchestration Server
Explanation: Calling terminate() from within the Job class does not immediately terminate the JDL thread of that job; instead, it sends a message to the server requesting termination of the job.
Action: Termination can take time to occur (because subjobs need to be recursively terminated and joblets cancelled). If the calling JDL thread needs to stop immediately, follow the call to terminate() directly with return.
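The difference can be sketched in plain Python. The Job class below is a hypothetical stand-in written for this illustration, not the real JDL Job class, and the event method name job_started_event is assumed for the example. The stand-in terminate() only records a request, mirroring the behavior described above.

```python
class Job:
    """Hypothetical stand-in for the JDL Job base class (not the real API)."""

    def __init__(self):
        self.termination_requested = False
        self.log = []

    def terminate(self):
        # Only records a termination request; the calling thread keeps running.
        self.termination_requested = True


class WithoutReturn(Job):
    def job_started_event(self):
        self.terminate()
        # Still executes, because terminate() did not stop this thread.
        self.log.append("work after terminate()")


class WithReturn(Job):
    def job_started_event(self):
        self.terminate()
        return  # leave the method right away
        self.log.append("work after terminate()")  # never reached


if __name__ == "__main__":
    for cls in (WithoutReturn, WithReturn):
        job = cls()
        job.job_started_event()
        print(cls.__name__, job.termination_requested, job.log)
```

In both variants the termination request is made, but only the variant that returns immediately avoids running further statements in the calling thread.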

Java Programs That Use the JDL Exec Class Might Hang

Source: Cloud Manager Orchestration Server
Explanation: Processes that are spawned by using the JDL Exec class on a Windows Orchestration Agent might hang when the spawned process attempts to read from stdin.
Action: To work around this issue, use the following steps to turn off the enhanced ExecWrapper:
  1. In the Explorer tree of the Orchestration Console, select the job that you want to change.

  2. In the admin view of the job, select the JDL Editor tab to open the JDL Editor.

  3. Paste the following code into the editor:

    e = Exec()
    # Turn off the enhanced ExecWrapper for this Exec instance
    e.setUseJvmRuntimeExec(True)
    
  4. Save the changes.

NOTE: Disabling the enhanced ExecWrapper also makes the other process control features that it provides unavailable, such as running the process as a different user than the Orchestration Agent, or redirecting files (Exec.setStdoutFile, Exec.setStderrFile, and Exec.setStdinFile).

For more information about the JDL Exec class, see the Cloud Manager 2 JDL documentation.