9.1 Rules for Management Servers

Under normal conditions, you should not run any regularly scheduled monitoring jobs or reports on the computer you are using as the management server. Keeping jobs off the management server prevents the agent services from competing with the management server service for resources and may improve the management server's processing capacity.

Specifically, you should avoid running jobs that perform remote monitoring operations. For example, you should not run the following jobs on the management server:

  • General_MachineDown

  • NT_RemoteServiceDown

  • General_EventLog and General_ASCIILog

  • Any Report Knowledge Scripts

Instead of running these Knowledge Scripts on the management server, you should select a specific agent computer to handle remote monitoring tasks and a specific agent computer for running reports.
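The placement rule above can be checked mechanically before jobs are created. The following is a minimal sketch in Python, not an AppManager API: the `jobs_to_move` function, the (script, computer) job representation, and the assumption that report Knowledge Scripts can be recognized by a `Report` name prefix are all hypothetical and introduced only for illustration.

```python
# Remote monitoring Knowledge Scripts this section says not to run
# on the management server.
REMOTE_MONITORING_SCRIPTS = {
    "General_MachineDown",
    "NT_RemoteServiceDown",
    "General_EventLog",
    "General_ASCIILog",
}

# ASSUMPTION: report Knowledge Scripts share this name prefix.
# Adjust it to match the naming used in your environment.
REPORT_PREFIX = "Report"


def jobs_to_move(jobs, management_server):
    """Return the (script, computer) pairs that run a discouraged
    Knowledge Script on the management server itself.

    jobs is an iterable of (script_name, computer_name) tuples, a
    hypothetical representation of where each job is scheduled to run.
    """
    flagged = []
    for script, computer in jobs:
        # Computer names are compared case-insensitively.
        on_management_server = computer.lower() == management_server.lower()
        discouraged = (
            script in REMOTE_MONITORING_SCRIPTS
            or script.startswith(REPORT_PREFIX)
        )
        if on_management_server and discouraged:
            flagged.append((script, computer))
    return flagged
```

The sketch flags only the Knowledge Scripts named in this section. Any job it returns is a candidate for moving to the agent computer you have designated for remote monitoring or for reports.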

In most cases, you can still use the remote monitoring Knowledge Scripts to monitor the availability of the management server without running the job on the management server itself. For example, you can run the job on another agent computer and specify the name of the management server in the list of computers to monitor when you configure the job. You can also use Troubleshooter and NetIQctrl commands to check the operation of the management server and to diagnose problems, and you can use the AMAdmin_MSHealth Knowledge Script to monitor the Windows event log for events generated by the management server. For information about using Troubleshooter and NetIQctrl, see Section 12.0, Troubleshooting and Diagnostic Tools.

9.1.1 Using Anti-virus Software

In addition to restricting the monitoring jobs you run directly on the management server, you should use caution when running anti-virus software on the management server. In particular, you should not perform any real-time anti-virus scanning of the following NetIQ folders:

  • AppManager\dat\pioc

  • AppManager\dat\mapqueue

  • AppManager\bin\Cache

  • Temp\NetIQ_Debug\computer_name

These folders are updated frequently, and real-time scanning can cause resource contention. Therefore, you should exclude these folders from any anti-virus scanning activity.
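Because the folders above are listed as relative paths, it can help to expand them into full paths before entering them in the exclusion list of your anti-virus product. The following Python sketch shows one way to do that. The `exclusion_paths` function and its parameters are hypothetical, and the installation root and Temp directory must be supplied by you; the values in the usage example are placeholders, not documented defaults.

```python
from pathlib import PureWindowsPath

# AppManager folders listed in this section, relative to the
# installation root.
APPMANAGER_FOLDERS = (
    r"AppManager\dat\pioc",
    r"AppManager\dat\mapqueue",
    r"AppManager\bin\Cache",
)


def exclusion_paths(netiq_root, temp_dir, computer_name):
    """Build the full folder paths to exclude from real-time scanning.

    netiq_root    -- folder that contains the AppManager folder
                     (ASSUMPTION: supplied by the caller)
    temp_dir      -- the Temp directory that holds NetIQ_Debug
                     (ASSUMPTION: supplied by the caller)
    computer_name -- name of the management server computer
    """
    root = PureWindowsPath(netiq_root)
    paths = [str(root / folder) for folder in APPMANAGER_FOLDERS]
    # The debug folder lives under Temp rather than the install root.
    paths.append(str(PureWindowsPath(temp_dir) / "NetIQ_Debug" / computer_name))
    return paths


# Usage with placeholder locations:
for path in exclusion_paths(r"D:\NetIQ", r"D:\Temp", "MS01"):
    print(path)
```

How the resulting paths are registered as exclusions depends on the anti-virus product, so that step is intentionally left out.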

9.1.2 Checking Management Server Status

As you increase the number of agents you monitor with a management server, it is also important to monitor the operational health and performance of the management server itself. The key indicators you should watch to determine the health of the management server are summarized in the following table:

Performance Counter                 What to Look for
----------------------------------  ------------------------------------------
Processor:                          The average percentage of processor time
% Processor Time (All instances)    should not exceed 80%, although occasional
                                    spikes above that level can be expected.

System:                             The processor queue should hold no more
Processor Queue Length              than 3 ready threads per processor. A
                                    growing queue may indicate that the
                                    management server is becoming overloaded,
                                    for example, because it is attempting to
                                    process a large number of events or data
                                    points or because a slow connection has
                                    created a backlog of information to be
                                    transmitted.

NetIQms:                            These counters should remain at or near
IOC Coll. Events Queued             zero (0), which indicates the queues are
IOC Data Queued                     not growing. The IOC counters refer to
IOC Events Queued                   disk-based queues that store events and
                                    data when the management server is
                                    temporarily busy. Over time, these queues
                                    should remain nearly empty, indicating
                                    that events and data are being processed
                                    in a timely manner. If the queues grow
                                    over time, the management server cannot
                                    keep up with the load created by the
                                    agents.

If any of these counters consistently exceeds the indicated threshold, or if the IOC counters grow continuously, the management server is either handling too many agents or is undersized for the load.
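The thresholds in this section can be applied to counter samples you have already collected, for example from Performance Monitor logs. The following Python sketch illustrates the checks under stated assumptions and is not part of AppManager: the function names are hypothetical, "consistently exceeds" is approximated by comparing the average of the samples with the threshold, and "grow continuously" is approximated by a series that never decreases and ends higher than it started.

```python
from statistics import mean

CPU_AVG_MAX = 80.0       # percent, from the table in this section
QUEUE_MAX_PER_CPU = 3    # ready threads per processor, from the table


def is_growing(samples):
    """ASSUMPTION: a queue 'grows continuously' if it never shrinks
    between samples and ends higher than it started."""
    if len(samples) < 2:
        return False
    never_shrinks = all(b >= a for a, b in zip(samples, samples[1:]))
    return never_shrinks and samples[-1] > samples[0]


def check_management_server(cpu_samples, queue_samples, ioc_samples, processors):
    """Return a list of warning strings; an empty list means no
    threshold in this section was exceeded.

    cpu_samples   -- % Processor Time readings
    queue_samples -- Processor Queue Length readings
    ioc_samples   -- dict mapping an IOC counter name to its readings
    processors    -- number of processors in the management server
    """
    warnings = []
    if mean(cpu_samples) > CPU_AVG_MAX:
        warnings.append("average % Processor Time exceeds 80%")
    if mean(queue_samples) > QUEUE_MAX_PER_CPU * processors:
        warnings.append(
            "Processor Queue Length exceeds 3 ready threads per processor"
        )
    for counter, samples in ioc_samples.items():
        if is_growing(samples):
            warnings.append(f"{counter} is growing")
    return warnings
```

Averaging tolerates the occasional spikes the table allows for. A stricter or looser definition of "consistently" or "growing" may suit your environment better, so treat both heuristics as starting points.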