A.0 Sentinel 6.1 Rapid Deployment Troubleshooting Checklist

Use this checklist to help diagnose a problem. Filling it in can resolve common issues outright or reduce the time needed to resolve more complex ones.

Table A-1 Checklist

Checklist Item                                               Example
-----------------------------------------------------------  ---------------------------------------------
Novell Version                                               V6.1 Rapid Deployment
Novell Platform and OS Version                               SUSE Linux Enterprise Server 10 SP2 or later
Database Platform and OS Version                             PostgreSQL 8.3
Sentinel Server Hardware Configuration                       • Processor: 4 CPU @ 3 GHz
                                                             • Memory: 5 GB RAM
                                                             • Other
Database Storage Configuration (NAS, SAN, local, and so on)  Local with offsite backup
Reporting Engine and Configuration                           Jasper Report Engine

NOTE: Depending on how your Sentinel system is configured, you might need to expand the above table. For instance, additional information might be needed for Advisor, Sentinel Control Center, and Collector Manager.

  1. Check the Novell Customer Center for your particular issue:

    • Is this a known issue with a workaround?

    • Is this issue fixed in the latest patch release or hot-fix?

    • Is this issue currently scheduled to be fixed in a future release?

  2. Determine the nature of the problem.

    • Can it be reproduced? Can the steps to reproduce the problem be enumerated?

    • What user action, if any, will cause the problem?

    • Is the issue periodic in nature?

  3. Determine the severity of this problem.

    • Is the system still useable?

  4. Understand the environment and systems involved.

    • What platforms and product versions are involved?

    • Are there any non-standard or custom components involved?

    • Is it a high event rate environment?

    • What is the rate of events being collected?

    • What is the event rate of insertion into the database?

    • How many concurrent users are there?

    • Is correlation used? How many rules are deployed?

    Collect configuration files, log files and system information from appropriate subdirectories in <Install_Directory>. Assemble this information for possible future knowledge transfer.
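
    The collection step above could be scripted along the following lines. This is a minimal sketch, not a Novell-supplied tool: the function name collect_sentinel_info and the subdirectory names config and log are assumptions for illustration, so substitute the subdirectories that actually exist under <Install_Directory>.

```shell
#!/bin/sh
# Sketch: bundle configuration files, log files, and basic system
# information from an install directory into a single tar archive.
# ASSUMPTION: "config" and "log" are placeholder subdirectory names.
collect_sentinel_info() {
    install_dir=$1
    out_tar=$2
    if [ ! -d "$install_dir" ]; then
        echo "no such directory: $install_dir" >&2
        return 1
    fi

    work=$(mktemp -d) || return 1

    # Record basic system information next to the copied files.
    {
        date
        uname -a
        df -k "$install_dir"
    } > "$work/system-info.txt" 2>&1

    # Copy only the subdirectories that are actually present.
    for sub in config log; do
        if [ -d "$install_dir/$sub" ]; then
            cp -R "$install_dir/$sub" "$work/$sub"
        fi
    done

    # Archive the staging directory, then clean it up.
    tar -cf "$out_tar" -C "$work" .
    status=$?
    rm -rf "$work"
    return $status
}
```

    A possible invocation, with a hypothetical output path:

      collect_sentinel_info <Install_Directory> /tmp/sentinel-info.tar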

  5. Check the health of the system.

    • Can you log into the Sentinel Control Center?

    • Are events being generated and inserted into the database?

    • Can events be seen on the Sentinel Control Center?

    • Can events be retrieved from the database using quick query?

    • Check the RAM usage, disk space, process activity, CPU usage and network connectivity of the hosts involved.

    • Verify that all expected Sentinel processes are running, for example with the command ps -ef | grep novell.

    • Check for any core dumps in any of the sub-directories of <Install_Directory>. Find out which process core dumped.

      cd <Install_Directory>
      
      find . -name core -print
      
    • Make sure the ActiveMQ broker is running. Connectivity can be verified using the ActiveMQ management console. Check that the various connections from Novell processes are active. Make sure that a lock file is not preventing ActiveMQ from starting. Optionally, use telnet to test the broker port on that server, for example telnet sentinel.company.com 61616.

    • Check whether the wrapper service is running on the server (ps -ef | grep wrapper).

    • Are any errors visible in the Servers View of the Sentinel Control Center? Are any errors visible in the Event Source Management Live View in the Sentinel Control Center? What is the OS resource consumption on the Collector Managers?
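
    Several of the host-level checks in this step can be wrapped in small helper functions. The sketch below is illustrative only: the function names are hypothetical, and port_open assumes a bash build with /dev/tcp support plus the coreutils timeout command, which may not be present on every host.

```shell
#!/bin/bash
# Sketch of three health-check helpers. All function names are hypothetical.

# Read `ps -ef` output on stdin and print the lines that mention the
# given pattern, skipping the grep process itself.
filter_procs() {
    grep -- "$1" | grep -v 'grep'
}

# List regular files named "core" anywhere below the given directory.
find_core_files() {
    find "$1" -type f -name core -print
}

# Return 0 if a TCP connection to host:port succeeds within 5 seconds.
# ASSUMPTION: bash supports /dev/tcp and `timeout` is installed.
port_open() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}
```

    For example, using the host name and port from the telnet check above:

      ps -ef | filter_procs novell
      find_core_files <Install_Directory>
      port_open sentinel.company.com 61616 && echo "broker port reachable"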

  6. Is there a problem with the Database?

    • Using pgAdmin, can you log into the database?

    • Does the database allow a pgAdmin login to the SIEM schema using the Novell dbauser account?

    • Does querying one of the tables succeed?

    • Does a select statement on a database table succeed?

    • Check the JDBC drivers, their locations and class path settings.

    • Is the database being maintained by an administrator? By anyone?

    • Has the database been modified by that administrator?

    • Is Sentinel Data Manager (SDM) being used to maintain the partitions and to archive or delete partitions to make more room in the database?

    • Using SDM, what is the current partition? Is it P_MAX?
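
    Where pgAdmin is not at hand, the login and select checks can be approximated from a shell with the psql client. This is a sketch under stated assumptions: the function name db_select_check is hypothetical, the database name passed to -d must match your installation, and the query against information_schema.tables is only a generic stand-in for a select on a Sentinel table. The PSQL variable exists solely so the client binary can be overridden.

```shell
#!/bin/sh
# Sketch: confirm that a login works and that a simple SELECT succeeds.
# Usage: db_select_check HOST DATABASE USER
# ASSUMPTION: the psql client is installed and on PATH (or PSQL is set).
db_select_check() {
    host=$1
    db=$2
    user=$3
    # -A (unaligned) and -t (tuples only) print just the resulting count.
    "${PSQL:-psql}" -h "$host" -U "$user" -d "$db" -At \
        -c 'SELECT count(*) FROM information_schema.tables;'
}
```

    A non-zero exit status indicates that either the login or the query failed. For example, with a placeholder database name:

      db_select_check localhost <database_name> dbauser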

  7. Inspect whether the product environment settings are correct.

    • Verify that the user login shell scripts, environment variables, configuration, and Java home settings are correct.

    • Are the environment variables set to run the correct JVM?

    • Verify the proper permissions on the folders for the installed product.

    • Check whether any cron jobs are set up that interfere with the product's functionality.

    • If the product is installed on NFS mounts, check the health of the NFS mounts and the NFS/NIS services.
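
    Some of these environment checks lend themselves to a quick script. The following is a minimal sketch with hypothetical function names. It assumes the JVM is located through the JAVA_HOME variable, and the expected owning user is passed in as a parameter rather than taken from the product documentation.

```shell
#!/bin/sh
# Sketch of two environment checks. Function names are hypothetical.

# Verify that JAVA_HOME is set and contains an executable bin/java.
check_java_home() {
    if [ -z "$JAVA_HOME" ]; then
        echo "JAVA_HOME is not set"
        return 1
    fi
    if [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "no executable java under $JAVA_HOME/bin"
        return 1
    fi
    echo "JAVA_HOME ok: $JAVA_HOME"
}

# Print every entry below a directory that is NOT owned by the given user.
# Usage: list_foreign_owned DIRECTORY USER
list_foreign_owned() {
    find "$1" ! -user "$2" -print
}
```

    Scheduled jobs for the current user can be reviewed with crontab -l.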

  8. Is there a possible memory leak?

    • Obtain the statistics on how fast the memory is being consumed and by which process.

    • Gather the metrics of the events throughput per Collector.

    • Run the prstat command on Solaris. This will give the process runtime statistics.

    • On Windows, you can check the process size and handle count in Task Manager.
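
    On a Linux host, memory growth for a single process can be sampled over time from /proc. This is a sketch, assuming a Linux /proc filesystem; the function name sample_rss is hypothetical. Each output line holds the epoch time in seconds followed by the resident set size in kB, so a steadily rising second column under a steady event rate points to the process worth investigating.

```shell
#!/bin/sh
# Sketch: print "epoch_seconds rss_kb" for one process, COUNT times,
# INTERVAL seconds apart.
# Usage: sample_rss PID COUNT INTERVAL
# ASSUMPTION: Linux, where /proc/PID/status exposes a VmRSS line.
sample_rss() {
    pid=$1
    count=$2
    interval=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        rss=$(awk '/^VmRSS:/ {print $2}' "/proc/$pid/status" 2>/dev/null)
        if [ -z "$rss" ]; then
            echo "cannot read RSS for pid $pid" >&2
            return 1
        fi
        echo "$(date +%s) $rss"
        i=$((i + 1))
        # Sleep between samples, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
    return 0
}
```

    For example, to take 10 samples one minute apart for a process ID found with ps:

      sample_rss <pid> 10 60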

    If the issue is still unresolved at this point, it is ready for escalation. Possible results of escalation are:

    • Configuration file changes

    • Hot fixes or patches to your system

    • Enhancement request

    • Temporary workaround