2.3 Validating Configurations and Troubleshooting Common Issues with Start Up Self Tests

Operations Center self tests run automatically at start up to test configurations and verify the server is healthy before starting the application. These self tests check for common configuration problems and issue warnings or stop the server from starting if applicable. If problems are found, information is written to the server logs.

The following tests are run at start up:

  • AvailablePort (formula.web.server.port, formula.web.server.ssl.port, and rmi.port)

  • IsLocalHost/IP (formula.web.server.host and MOAddress)

  • VMMemory

  • DiskFreeSpace (formula.home)

  • HttpURLCheck (ImageServer.ExternalServer and ImageServer.InternalServer)

  • FilePermissions (formula.home)

  • JavaVersion

NOTE: Default test scripts are not editable, and any changes to these scripts, except to configurable test settings, are ignored. All test files are signed; if a file is modified, a warning message is logged.

Configurable test settings include the critical flag and retry settings. See Section 2.3.2, Setting Tests to Halt Component Startup on Test Failure and Section 2.3.3, Updating Retry Settings for more information about these configurable settings.

The following sections address how to set various parameters on the self tests and the location of log files:

2.3.1 Understanding the Test Definition Repository and Test Script Directories

Self test scripts are located in the product component root directories under the Operations Center installation directory:

  • /OperationsCenter_install_path/SelfTestScripts

  • /OperationsCenter_CMS_install_path/SelfTestScripts

  • /OperationsCenter_Dashboard_install_path/SelfTestScripts

Under each of the SelfTestScripts directories, new directories are created for each test during the first run, such as:

  • /OperationsCenter_install_path/SelfTestScripts/NOC

  • /OperationsCenter_install_path/SelfTestScripts/NOC Daemon

Each of these test script directories contains the following two files:

  • definition.xml: defines the configuration of the test, including where to retrieve values.

  • unit.xml: defines the hierarchical ordered set of tests to be run and the inputs required.
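As a quick sanity check of the layout described above, the following minimal Python sketch scans a SelfTestScripts directory and reports any test script directory that is missing definition.xml or unit.xml. It is not part of Operations Center; the function name find_incomplete_test_dirs is hypothetical, and the sketch only reads the directory tree.

```python
from pathlib import Path

# The two files each test script directory is expected to contain.
REQUIRED_FILES = ("definition.xml", "unit.xml")


def find_incomplete_test_dirs(self_test_scripts_dir):
    """Return {directory name: [missing file names]} for incomplete test directories.

    self_test_scripts_dir is the path to a SelfTestScripts directory.
    Hypothetical helper: it only inspects the file system and changes nothing.
    """
    root = Path(self_test_scripts_dir)
    incomplete = {}
    for child in sorted(root.iterdir()):
        if not child.is_dir():
            continue  # ignore stray files next to the test directories
        missing = [name for name in REQUIRED_FILES if not (child / name).is_file()]
        if missing:
            incomplete[child.name] = missing
    return incomplete
```

For example, calling find_incomplete_test_dirs("/OperationsCenter_install_path/SelfTestScripts") returns an empty dictionary when every test script directory contains both files.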

2.3.2 Setting Tests to Halt Component Startup on Test Failure

When a critical test fails, it logs errors and halts the product component at startup. You can set the critical flag on any test so that the component does not start if that test fails. Setting a group of tests to critical halts product component startup if any test in the group fails.

WARNING: Self tests ensure that your environment is properly configured. Disabling the criticality of tests could result in your components starting in an unusable state.

To change the critical flag for a test:

  1. Open the /OperationsCenter_ProductComponent_install_path/SelfTestScripts/script_name/unit.xml file in any text or XML editor.

    Where OperationsCenter_ProductComponent_install_path is the installation path for the Operations Center server, CMS or Dashboard.

  2. To log errors and halt the component from starting up, add or set the critical property in the test’s XML attributes as:

    critical="true"

    For example, the JavaVersion test would be:

    <test classname="com.mosol.selftest.tests.JavaVersion" critical="true" mandatory="true" name="java version">

  3. To continue to log errors but not halt component startup on test failure, set the critical attribute to false.

  4. Save the file.
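Before or after editing, it can help to review which tests are currently marked critical. The following Python sketch reads the contents of a unit.xml file and lists the configurable attributes of every test element. It is read-only and makes several assumptions: the function name list_test_settings is hypothetical, tests are assumed to be declared as test elements anywhere in the hierarchy, and attributes that are absent are reported as None because the defaults are not stated here.

```python
import xml.etree.ElementTree as ET


def list_test_settings(unit_xml_text):
    """Return one dict per <test> element with its configurable attributes.

    Attribute values are returned as the raw strings found in the file;
    None means the attribute is not set on that test.
    """
    root = ET.fromstring(unit_xml_text)
    settings = []
    # iter() walks the whole hierarchy, including the root element itself.
    for test in root.iter("test"):
        settings.append({
            "name": test.get("name"),
            "critical": test.get("critical"),
            "retries": test.get("retries"),
            "retryDelay": test.get("retryDelay"),
        })
    return settings
```

To use it, read the file first, for example with open(path, encoding="utf-8").read(), and pass the text to list_test_settings.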

2.3.3 Updating Retry Settings

When a test fails, it is possible to retry the execution of the test before calling the test a failure.

There are two XML attributes that can be set in the test’s declaration to configure retry settings:

  • retries: the number of times to retry the test after the first execution fails

  • retryDelay: the number of seconds to wait before attempting to retry the test

To update retry settings for a test:

  1. Open the /OperationsCenter_ProductComponent_install_path/SelfTestScripts/script_name/unit.xml file in any text or XML editor.

    Where OperationsCenter_ProductComponent_install_path is the installation path for the Operations Center server, CMS or Dashboard.

  2. To set the number of retries to perform, set the retries attribute. To set the number of seconds to wait before each retry attempt, set the retryDelay attribute.

    For example, the following test will retry twice if there is a problem and wait 3 seconds before each try:

    <test classname="com.mosol.selftest.tests.HttpURLCheck" critical="true" mandatory="true" name="internal image server" retries="2" retryDelay="3">

  3. Save the file.
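The interaction of the two attributes can be sketched in Python as follows. This is an interpretation of the attribute descriptions above, not the product's implementation: it assumes a test runs once, is retried up to retries more times, and waits retryDelay seconds before each retry. The function name run_with_retries and the injectable sleep parameter are hypothetical.

```python
import time


def run_with_retries(test, retries=0, retry_delay=0, sleep=time.sleep):
    """Run test() and retry on failure; return (passed, attempts).

    test        -- callable returning True on success, False on failure
    retries     -- number of extra attempts after the first failure
    retry_delay -- seconds to wait before each retry
    sleep       -- wait function, injectable so the logic can be checked quickly
    """
    attempts = 0
    while True:
        attempts += 1
        if test():
            return True, attempts
        if attempts > retries:
            # The first run plus all allowed retries have failed.
            return False, attempts
        sleep(retry_delay)
```

Under these assumptions, retries="2" with retryDelay="3" means a test that keeps failing is executed three times in total, with two 3-second waits, before it is reported as failed.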

2.3.4 Understanding Self Test Message Logs and Errors

During normal operations, self tests provide informational logging to describe the tests run and the actions performed by the tests. Logs can be found in the standard log file of the product component for which the tests are being run. For example:

  • /OperationsCenter_install_path/logs

  • /OperationsCenter_CMS_install_path/logs

The following is an example of an informational log message for a self test:

2010-07-28 10:34:30,655 INFO Self Tests/NOC Daemon/java version - Minimum Supported Java version is :1.6.0

  • Where the log level is INFO. The level of logging is customizable for a group of tests or a single test by adding or setting logLevel="loglevel" to the XML attributes of the test, with acceptable loglevel values of DEBUG, INFO, WARN or ERROR.

  • And, where the log category is Self Tests/NOC Daemon/java version. The log category of the log entry is the fully pathed name of the test outputting the log.
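When reviewing many log entries, it can be convenient to split each line into its parts. The following Python sketch parses lines that follow the layout of the example above (timestamp, log level, log category, a " - " separator, then the message). The pattern is inferred from that single example, so other log layouts might not match; the names LOG_LINE and parse_self_test_line are hypothetical.

```python
import re

# Layout inferred from the sample entry: "<date> <time>,<ms> <LEVEL> <category> - <message>"
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>DEBUG|INFO|WARN|ERROR) "
    r"(?P<category>.+?) - (?P<message>.*)$"
)


def parse_self_test_line(line):
    """Return a dict with timestamp, level, category, and message, or None if no match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None
```

Splitting the category on "/" gives the path of the test, for example Self Tests, NOC Daemon, and java version for the entry shown above.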

Error logs can indicate problems in your configuration. Use the Operations Center Customizer to edit the configuration as appropriate. Table 2-1 lists the possible cause and action for various error messages.

Table 2-1 Sample Error Messages in Self Test Logs

Error Message: An IOException occurred binding socket.
Possible Cause and Action: The port noted in previous messages might already be in use by another application. Free the port or change the relevant port value in the Operations Center Customizer.

Error Message: permissions_level access is denied on 'filename'
Possible Cause and Action: The current user (running Operations Center) does not have the required permissions. Verify that the user has the appropriate permissions and reset file permissions if necessary.

Error Message: Exception attempting to reach URL
Possible Cause and Action: The server might not be running or might not have completed its initialization. Check that the server is running and initialized, and retry the connection. Adjust the retry settings if necessary.

Error Message: host_address is not a local address.
Possible Cause and Action: The IP address to which this host address resolves is not assigned to this machine. Verify that the host specified in the Customizer is the correct host name of the machine.

Error Message: Unknown host "host_address"
Possible Cause and Action: The IP address for the host cannot be found. Verify that the host specified in the Customizer is the correct host name of the machine.

Error Message: Critical test "test_name" failed. Stopping execution
Possible Cause and Action: A test marked critical has failed, and the server has stopped. Review the preceding errors in the log to correct possible configuration issues.

Error Message: Execution of test "test_name" failed in duration ms. (number_of_attempts attempts)
Possible Cause and Action: A test failed, possibly after multiple attempts, but server startup has not been halted. Review the preceding errors in the log to correct possible configuration issues.
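For the socket binding error in Table 2-1, one way to confirm whether a port is already taken is to try binding to it yourself. The following Python sketch does this; it is a troubleshooting aid written for this guide's scenario, not the AvailablePort test itself, and the function name is_port_available is hypothetical. Run it on the Operations Center host while the component is stopped, using the port value shown in the Customizer.

```python
import socket


def is_port_available(port, host=""):
    """Return True if a TCP socket can be bound to (host, port), False otherwise.

    An empty host string binds to all local IPv4 interfaces.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" or insufficient privileges.
            return False
        return True
```

A False result means another process holds the port or the current user is not permitted to bind to it; in either case, free the port or choose a different port value.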