4.14 Health

This Knowledge Script is obsolete, although you can continue to use it. Its functionality is distributed among the following Knowledge Scripts introduced with AppManager for Exchange Server 2007 version 7.3:

*******************************************************************************************************

Use this Knowledge Script to monitor the health of Exchange Server 2007 server roles. This script monitors the following services and activities:

  • The Windows Event Log for errors and warnings that arise from any service name that contains the word exchange

  • Running status of all Exchange Server 2007 services

  • Clock synchronization with the Domain Controller

  • Response time to ActiveSync, Outlook Web Access, Outlook Web services, and the Autodiscovery service

  • Number of messages in queue and change in queue size

  • Status of send, receive, and foreign connectors

  • Speed of e-mail flow to a specified e-mail address or Mailbox server

  • Availability of offline address books and public folders

  • Accessibility of the system mailbox

  • Communication between Hub Transport server and Mailbox server

  • Synchronization between Hub Transport server and Edge Transport server

  • Response to SMTP requests

  • Replication health

  • Status of mounted and unmounted databases

  • Available disk space

This script monitors and restarts the following Exchange Server 2007 services:

Mailbox Server Role Services

  • IIS Admin Service

  • Exchange Active Directory Topology

  • Exchange Information Store

  • Exchange Mailbox Assistants

  • Exchange Mail Submission

  • Exchange Replication Service

  • Exchange System Attendant

  • Exchange Search Indexer

  • Exchange Service Host

  • Exchange Transport Log Search

  • Search (Exchange)

  • World Wide Web Publishing Service

Client Access Server Role Services

  • IIS Admin Service

  • Exchange Active Directory Topology

  • Exchange File Distribution

  • Exchange Service Host

  • World Wide Web Publishing Service

Hub Transport Server Role Services

  • Exchange Active Directory Topology

  • Exchange EdgeSync

  • Exchange Transport

  • Exchange Transport Log Search

Edge Transport Server Role Services

  • Exchange ADAM

  • Exchange Credential Service

  • Exchange Transport

This script raises an event and displays information for the following Exchange Server 2007 parameters:

Parameter

What is Displayed

SummaryCopyStatus

Displays the current overall status of the Local Continuous Replication (LCR), Cluster Continuous Replication (CCR) and Single Copy Cluster (SCC) copies. The possible values for the SummaryCopyStatus parameter are:

  • Not Supported: The current configuration does not support continuous replication.

  • Disabled: The storage group and its database object have HasLocalCopy set to 0.

  • Failed: Database or logs are incompatible with each other, or the storage group is improperly configured for LCR.

  • Seeding: Database seeding is in progress.

  • Suspended: Transaction log copying and replay are stopped.

  • Healthy: Status is healthy and normal.

LastInspectedLogTime

Displays the time stamp on the target storage group of the last successful inspection of a transaction log file. The time stamp is displayed if the database is down, or the CopyQueueLength and ReplayQueueLength parameters exceed the recommended thresholds.

NOTE: For an LCR copy, this script checks and displays the SummaryCopyStatus and LastInspectedLogTime parameter values only if Exchange Server 2007 is installed in a non-clustered environment.

This script also raises an event for the following conditions:

  • The Knowledge Script job fails

  • Stopped services fail to start

  • The Windows Event Log contains errors and warnings

  • Thresholds are exceeded

  • Connectors are disabled

4.14.1 Prerequisites

If the AppManager agent service, netiqmc, is not running under the Local System account, then ensure that the user account running the service is a member of the following groups for the indicated server roles.

Server Roles

Membership Group

Client Access server role

The Health Knowledge Script monitors the health of vital Client Access server components such as ActiveSync, Outlook Web Services, and Outlook Web Access. This monitoring requires the creation of a test user and an associated mailbox. If a test user and a mailbox do not exist, AppManager creates them automatically when the Health Knowledge Script runs.

However, for AppManager to create the test user and mailbox, the account under which the AppManager netiqmc service runs must be a member of the following groups:

  • Exchange Organization Administrators group

  • Local Administrators group

After you run the Health Knowledge Script once with proper permissions and memberships in place and the test user and mailbox are created, a user with lesser privileges can run the Health Knowledge Script.

You can also create the test user and mailbox manually. For more information about creating the test user and mailbox, see your Microsoft Exchange Server 2007 documentation.

Edge Transport server role

Local Administrators group

Hub Transport server role

  • Exchange Organization Administrators group

  • Local Administrators group

If the AppManager agent service is running under the Local System account, all health monitoring will succeed except the monitoring of message queues.

Mailbox server role

  • Exchange Server Administrators group

  • Local Administrators group
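The group requirements listed above can be captured as a small lookup. The following Python sketch is illustrative only: REQUIRED_GROUPS and required_groups are hypothetical names, not part of AppManager, and the rule that a server hosting several roles needs the union of each role's groups is an assumption rather than something this guide states.

```python
# Group memberships per server role, transcribed from the table above.
# For the Client Access role, the groups are the ones needed to create
# the test user and mailbox.
REQUIRED_GROUPS = {
    "Client Access": {"Exchange Organization Administrators", "Local Administrators"},
    "Edge Transport": {"Local Administrators"},
    "Hub Transport": {"Exchange Organization Administrators", "Local Administrators"},
    "Mailbox": {"Exchange Server Administrators", "Local Administrators"},
}


def required_groups(roles):
    """Return the sorted union of groups for the given roles (assumed rule)."""
    groups = set()
    for role in roles:
        groups |= REQUIRED_GROUPS[role]
    return sorted(groups)


print(required_groups(["Hub Transport", "Mailbox"]))
```

A check like this can be useful when reviewing a service account before the first run of the Health job.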

4.14.2 Resource Objects

  • Exchange_ServerIcon

  • Exchange_ClientAccessServer

  • Exchange_EdgeTransportServer

  • Exchange_HubTransportServer

  • Exchange_MailboxServer

To monitor individual storage groups, mailbox databases, transport queues, and services, use the Objects tab to select the specific objects to monitor.

4.14.3 Default Schedule

By default, this script runs every 30 minutes.

4.14.4 Setting Parameter Values

Set the following parameters as needed:

Parameter

How to Set It

General Settings

Communicate only with Exchange Servers in the local domain?

Select Yes to allow the Health job to test only Exchange Servers in the same domain as the server on which you run the Health job.

When this option is unselected, certain health tests for the Client Access server and the Hub Transport server attempt to contact all Mailbox servers in your organization. These tests will fail if the Exchange accounts in one domain do not have access to other domains.

Job failure event notification

Event severity when job fails

Set the severity level, from 1 to 40, to indicate the importance of an event in which the Health job fails. The default is 5.

Raise event indicating active/passive cluster state?

Select Yes to raise an informational event indicating the current status of the cluster: active or passive. The default is No.

Monitor Exchange 2007 Server Health

All Server Roles

Status of Exchange 2007 services

Raise event if Exchange 2007 services are not running?

Select Yes to raise an event if at least one Exchange Server 2007 service is not running. The default is Yes.

Event severity when services are not running

Set the severity level, from 1 to 40, to indicate the importance of an event in which at least one Exchange Server 2007 service is not running. The default is 15.

Start services not currently running?

Select Yes to start Exchange Server 2007 services that are not running. The default is Yes.

Raise event if stopped services fail to restart?

Select Yes to raise an event if AppManager cannot restart Exchange Server 2007 services that are not running. The default is Yes.

Threshold - Timeout for service restart

Set the number of seconds that AppManager should wait for Exchange Server 2007 services to restart before raising an event. The default is 15 seconds.
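The restart timeout can be pictured as a polling loop with a deadline. The Python sketch below is a generic illustration, not AppManager's implementation: is_running is a hypothetical callable that reports whether a service has started, and the poll interval is an assumed value.

```python
import time


def wait_until_running(is_running, timeout_seconds=15, poll_interval=0.5,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll is_running() until it returns True or the timeout elapses.

    Returns True if the service came up in time, False otherwise.
    The sleep and clock arguments exist only to make the loop testable.
    """
    deadline = clock() + timeout_seconds
    while True:
        if is_running():
            return True
        if clock() >= deadline:
            return False
        # Never sleep past the deadline.
        sleep(min(poll_interval, max(0.0, deadline - clock())))


# Example with a stand-in check that succeeds on the third poll.
states = iter([False, False, True])
print(wait_until_running(lambda: next(states), timeout_seconds=5, poll_interval=0.01))
```

A False result corresponds to the condition in which an event would be raised for services that fail to restart.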

Event severity when stopped services fail to start

Set the severity level, from 1 to 40, to indicate the importance of an event in which Exchange Server 2007 services fail to restart after the specified timeout period. The default is 5.

Windows Event Log

Raise event if errors are found?

Select Yes to raise an event if the Windows Event Log contains errors that arise from any service name that contains the word exchange. The default is Yes.

Event severity when errors are found

Set the severity level, from 1 to 40, to indicate the importance of an event in which the Windows Event Log contains error messages. The default is 10.

Raise event if warnings are found?

Select Yes to raise an event if the Windows Event Log contains warnings that arise from any service name that contains the word exchange. The default is Yes.

Event severity when warnings are found

Set the severity level, from 1 to 40, to indicate the importance of an event in which the Windows Event Log contains warning messages. The default is 20.
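The Event Log tests select entries whose service name contains the word exchange. The Python sketch below shows that kind of filter on in-memory records. LogEntry and exchange_entries are hypothetical names, the sample entries are made up for illustration, and treating the match as case-insensitive is an assumption.

```python
from collections import namedtuple

# Minimal stand-in for an event log record (hypothetical structure).
LogEntry = namedtuple("LogEntry", "source level message")


def exchange_entries(entries, level):
    """Return entries of the given level whose source contains 'exchange'."""
    return [
        e for e in entries
        if e.level == level and "exchange" in e.source.lower()
    ]


sample = [
    LogEntry("MSExchangeTransport", "Error", "sample error text"),
    LogEntry("MSExchangeIS", "Warning", "sample warning text"),
    LogEntry("W32Time", "Error", "sample error text"),
]
print([e.source for e in exchange_entries(sample, "Error")])
```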

Clock Synchronization with Domain Controller

Comma-separated list of Domain Controllers to test

Use this parameter to limit the number of Domain Controller (DC) clocks that are tested for synchronization with the clock on the server running the Health Knowledge Script.

Leave this parameter blank to test all DC clocks in your organization.

The list can contain fully qualified hostnames, separated by commas. It can also contain patterns that use the following wildcards. Separate the patterns with commas.

  • An asterisk (*) matches zero or more characters.

  • A question mark (?) matches a single character.

  • Square brackets ([]) match any single character included between the brackets. Use a dash (-) to specify a range between the brackets. For example, [a-z] matches any lowercase letter, [0-9] matches any digit, and [aeiou] matches any lowercase vowel.

If a pattern contains wildcards, all fully qualified DC names that match the pattern are included in the synchronization test.

All matching is case-sensitive.
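The wildcard rules above resemble shell-style globbing. As a rough way to preview which Domain Controllers a pattern list would select, the Python sketch below uses the standard fnmatch.fnmatchcase function, which supports *, ?, and [] and never folds case. It is only an approximation of the matching that AppManager performs; select_domain_controllers and the host names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_domain_controllers(pattern_list, known_dcs):
    """Return the DC names that match any comma-separated pattern.

    An empty pattern list selects every known DC, mirroring the
    "leave blank to test all" behavior described above.
    """
    patterns = [p.strip() for p in pattern_list.split(",") if p.strip()]
    if not patterns:
        return list(known_dcs)
    # fnmatchcase compares case-sensitively on every platform.
    return [dc for dc in known_dcs
            if any(fnmatchcase(dc, p) for p in patterns)]


dcs = ["dc01.corp.example.com", "dc02.corp.example.com", "DC03.corp.example.com"]
print(select_domain_controllers("dc0[1-2].*, backup?.corp.example.com", dcs))
```

Because matching is case-sensitive, DC03.corp.example.com is not selected by a pattern that begins with lowercase dc.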

Raise event if clocks are not synchronized?

Select Yes to raise an event if the clock on the server running the Health Knowledge Script is not synchronized with the clock on the DC. The default is Yes.

Threshold - Maximum amount of clock offset

Set the maximum number of seconds that the server clock can be out of sync with the DC. For example, setting the threshold to 2 indicates that it is acceptable for the clock to be two seconds faster or slower than the clock on the DC. The default is 10 seconds.

To require the server clock to match the DC clock exactly, set this parameter to 0.
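The threshold applies to the absolute offset, so a clock that runs fast and a clock that runs slow are treated the same way. The Python sketch below illustrates that comparison. clock_offset_exceeds is a hypothetical helper, and the strict greater-than comparison is an assumption based on the example above, in which an offset equal to the threshold is acceptable.

```python
from datetime import datetime, timedelta


def clock_offset_exceeds(server_time, dc_time, threshold_seconds=10):
    """Return True if the clocks differ by more than the threshold."""
    offset = abs((server_time - dc_time).total_seconds())
    return offset > threshold_seconds


dc = datetime(2024, 1, 1, 12, 0, 0)
print(clock_offset_exceeds(dc + timedelta(seconds=2), dc, threshold_seconds=2))
print(clock_offset_exceeds(dc - timedelta(seconds=3), dc, threshold_seconds=2))
```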

Event severity when clock offset exceeds threshold

Set the severity level, from 1 to 40, to indicate the importance of an event in which the clock synchronization offset exceeds the threshold you set. The default is 25.

Client Access Server Role

ActiveSync Connectivity

Raise event if ActiveSync connectivity test fails?

Select Yes to raise an event if AppManager cannot check connectivity to ActiveSync. The default is Yes.

Event severity when ActiveSync connectivity test fails

Set the severity level, from 1 to 40, to indicate the importance of an event in which AppManager cannot check connectivity to ActiveSync. The default is 15.

Raise event if response time is excessive?

Select Yes to raise an event if the amount of time it takes to connect to ActiveSync exceeds the threshold you set. The default is Yes.

Threshold - Response time for connectivity test

Set the number of milliseconds that AppManager should wait for connectivity with ActiveSync before raising an event. The default is 1000 ms. The minimum is 1 ms.
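The response-time thresholds in this section are expressed in milliseconds. The Python sketch below shows one generic way to time an operation and compare the result with such a threshold. timed_call and response_is_slow are hypothetical helpers, and the lambda stands in for a real connectivity test.

```python
import time


def timed_call(action):
    """Run action() and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = action()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def response_is_slow(elapsed_ms, threshold_ms=1000):
    """Return True if the measured time exceeds the threshold."""
    return elapsed_ms > threshold_ms


result, elapsed = timed_call(lambda: "connected")
print(result, response_is_slow(elapsed))
```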

Event severity when response time exceeds threshold

Set the severity level, from 1 to 40, to indicate the importance of an event in which the time taken for testing connectivity to ActiveSync exceeds the threshold that you set. The default is 25.

Outlook Web Access Connectivity

Raise event if Outlook Web Access connectivity test fails?

Select Yes to raise an event if AppManager cannot check connectivity to Outlook Web Access. The default is Yes.

Event severity when Outlook Web Access connectivity test fails

Set the severity level, from 1 to 40, to indicate the importance of an event in which AppManager cannot check connectivity to Outlook Web Access. The default is 15.

Raise event if response time is excessive?

Select Yes to raise an event if the amount of time it takes to connect to Outlook Web Access exceeds the threshold you set. The default is Yes.

Threshold - Response time for connectivity test

Set the number of milliseconds that AppManager should wait to confirm connectivity with Outlook Web Access before raising an event. The default is 1000 ms. The minimum is 1 ms.

Event severity when response time exceeds threshold

Set the severity level, from 1 to 40, to indicate the importance of an event in which the time taken for testing connectivity to Outlook Web Access exceeds the threshold that you set. The default is 25.

Outlook Web Services Connectivity

Use SSL (HTTPS) for connectivity test?

Select Yes to use Secure Socket Layer (SSL) to test connectivity to Outlook Web services. The default is Yes.

If you select Yes, AppManager will use only SSL to test connectivity. If you clear the option, AppManager will first use SSL to test connectivity. If that attempt fails, AppManager will then try to test connectivity without using SSL.
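The two behaviors of this option can be summarized as a short decision procedure. In the Python sketch below, probe is a hypothetical callable that attempts a connection with or without SSL and returns True on success. The function illustrates the order of attempts described above and is not AppManager code.

```python
def check_web_services(probe, ssl_only=True):
    """Return "ssl", "plain", or None depending on which attempt succeeds.

    probe(use_ssl) is a stand-in for the real connectivity test.
    """
    # SSL is always tried first, whatever the option is set to.
    if probe(True):
        return "ssl"
    # The non-SSL attempt happens only when the option is cleared.
    if not ssl_only and probe(False):
        return "plain"
    return None


# A stand-in endpoint that accepts only non-SSL connections.
plain_only = lambda use_ssl: not use_ssl
print(check_web_services(plain_only, ssl_only=True))
print(check_web_services(plain_only, ssl_only=False))
```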

Raise event if Outlook Web services connectivity test fails?

Select Yes to raise an event if AppManager cannot check connectivity to Outlook Web services. The default is Yes.

Event severity when Outlook Web services connectivity test fails

Set the severity level, from 1 to 40, to indicate the importance of an event in which AppManager cannot check connectivity to Outlook Web services. The default is 15.

Raise event if response time is excessive?

Select Yes to raise an event if the amount of time it takes to connect to Outlook Web services exceeds the threshold you set. The default is Yes.

Threshold - Response time for connectivity test

Set the number of milliseconds that AppManager should wait to confirm connectivity with Outlook Web services before raising an event. The default is 1000 ms. The minimum is 1 ms.

Event severity when response time exceeds threshold

Set the severity level, from 1 to 40, to indicate the importance of an event in which the time taken for testing connectivity to Outlook Web services exceeds the threshold that you set. The default is 25.

Autodiscovery Service Connectivity

Event Notification

Raise event if Autodiscovery service connectivity test fails?

Select Yes to raise an event if AppManager cannot check connectivity to the Autodiscovery service. The default is Yes.

The Autodiscovery service allows Outlook 2007 clients and mobile devices to be recognized when they connect to the Client Access server.

Event severity when Autodiscovery service connectivity test fails

Set the severity level, from 1 to 40, to indicate the importance of an event in which AppManager cannot check connectivity to the Autodiscovery service. The default is 15.

Offline Address Book Availability

Raise event if offline address books cannot be downloaded?

Select Yes to raise an event if the Client Access server’s offline address books cannot be downloaded. The default is Yes.

Event severity when offline address books cannot be downloaded

Set the severity level, from 1 to 40, to indicate the importance of an event in which offline address books cannot be downloaded. The default is 15.

Public Folder Availability

Raise event if public folders are inaccessible?

Select Yes to raise an event if the Client Access server’s public folders are inaccessible. The default is Yes.

A public folder can be inaccessible for one of the following reasons:

  • The public folder database is unmounted

  • The Mailbox server is not running

  • The user does not have proper access permissions

Event severity when public folders are inaccessible

Set the severity level, from 1 to 40, to indicate the importance of an event in which public folders are inaccessible. The default is 15.

Edge Transport Server Role

Message Hygiene

Raise event if percentage of content-filtered messages exceeds threshold?

Select Yes to raise an event if the percentage of increase in content-filtered messages exceeds the threshold you set. The default is Yes.

Content-filtered messages are those that the Edge Transport server filters for security breaches such as spam, viruses, or blocked e-mail addresses.

Threshold - Maximum percentage of messages filtered because of content

Set the maximum percentage that is acceptable for the increase in filtered messages between Knowledge Script job iterations. AppManager raises an event if the percentage exceeds the threshold. The default is 40%.

Event severity when percentage of filtered messages exceeds threshold

Set the severity level, from 1 to 40, to indicate the importance of an event in which the percentage of filtered messages exceeds the threshold you set. The default is 15.

Raise event if size of poison message queue exceeds threshold?

Select Yes to raise an event if the number of messages in the poison message queue exceeds the threshold you set. The default is Yes.

The poison message queue is a quarantine destination for messages that the Edge Transport server identifies as potentially fatal to the Exchange Server 2007 server.

Threshold - Maximum number of messages in poison message queue

Set the maximum number of messages that can be quarantined in the poison message queue before an event is raised. The default is 5 messages.

Event severity when number of messages in poison message queue exceeds threshold

Set the severity level, from 1 to 40, to indicate the importance of an event in which the number of messages in the poison message queue exceeds the threshold you set. The default is 15.

Message Queues

Raise event for number of messages in queue?

Select Yes to raise an event if the number of messages in queue exceeds the threshold you set. The default is Yes.

Threshold - Maximum number of messages in queue

Set the maximum number of messages that can be in queue before an event is raised. The default is 1000 messages.

Event severity when queue size exceeds threshold

Set the severity level, from 1 to 40, to indicate the importance of an event in which the number of messages in queue exceeds the threshold you set. The default is 15.

Raise event if increase in queue size is excessive?

Select Yes to raise an event if the percentage of increase in queue size since the last job iteration exceeds the threshold you set. The default is Yes.

Threshold - Percentage increase in queue size since last job iteration

Set the maximum acceptable percentage of increase in queue size since the last job iteration. AppManager raises an event if the percentage of increase exceeds this value. The default is 50%.
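The percentage of increase compares the queue size in the current job iteration with the size recorded in the previous one. The Python sketch below works through the arithmetic. The helper names are hypothetical, and the handling of an empty previous queue, where a percentage is undefined, is an assumption.

```python
def percent_increase(previous, current):
    """Return the percentage change from previous to current.

    Returns None when previous is zero or negative, because the
    percentage is undefined in that case (assumed handling).
    """
    if previous <= 0:
        return None
    return (current - previous) * 100.0 / previous


def queue_growth_exceeds(previous, current, threshold_percent=50.0):
    """Return True if the queue grew by more than the threshold."""
    change = percent_increase(previous, current)
    return change is not None and change > threshold_percent


# 200 messages last iteration, 320 now: (320 - 200) * 100 / 200 = 60%.
print(percent_increase(200, 320), queue_growth_exceeds(200, 320))
```

With the default threshold of 50%, a queue that grows from 200 to 300 messages sits exactly at the threshold and, under the strict comparison assumed here, would not trigger an event.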

Event severity when increase in queue size exceeds threshold

Set the severity level, from 1 to 40, to indicate the importance of an event in which the percentage of increase in queue size exceeds the threshold you set. The default is 15.

Send and Receive Connectors

Raise event if any send or receive connectors are disabled?

Select Yes to raise an event if a send or receive connector is disabled. The default is Yes.

The connectors send e-mail to and receive e-mail from a Hub Transport server. The e-mail may be sent and received within the organization through the intranet or outside the organization through the Internet.

Event severity when any connectors are disabled

Set the severity level, from 1 to 40, to indicate the importance of an event in which a send or receive connector is disabled. The default is 15.

Raise event if receive connectors cannot respond to SMTP requests?

Select Yes to raise an event if a receive connector is unable to respond to SMTP (Simple Mail Transfer Protocol) requests. The default is Yes.

Event severity when receive connectors cannot respond to SMTP requests

Set the severity level, from 1 to 40, to indicate the importance of an event in which a receive connector is unable to respond to SMTP requests. The default is 15.

Raise event if send connectors cannot send mail to internal recipients?

Select Yes to raise an event if a send connector is unable to send e-mail from the Internet to your intranet. The default is Yes.

Event severity when send connectors cannot send mail to internal recipients

Set the severity level, from 1 to 40, to indicate the importance of an event in which a send connector is unable to send e-mail from the Internet to your intranet. The default is 15.

Hub Transport Server Role

Server Communication

Raise event if time of last Edge synchronization exceeds threshold?

Select Yes to raise an event if synchronization between the Edge Transport server and the Hub Transport server has not occurred within the last n minutes. The default is Yes.

Use the Threshold - Maximum number of minutes since last Edge synchronization parameter to determine the value of n.

Threshold - Maximum number of minutes since last Edge synchronization

Set the maximum number of minutes that should elapse since the last synchronization before an event is raised. The default is 30 minutes.
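The Edge synchronization test compares the age of the last synchronization with the threshold in minutes. The Python sketch below illustrates the comparison. edge_sync_is_stale is a hypothetical helper, and how AppManager obtains the time of the last synchronization is not shown.

```python
from datetime import datetime, timedelta


def edge_sync_is_stale(last_sync, now, threshold_minutes=30):
    """Return True if more than threshold_minutes have passed since last_sync."""
    return (now - last_sync) > timedelta(minutes=threshold_minutes)


now = datetime(2024, 1, 1, 12, 0, 0)
print(edge_sync_is_stale(now - timedelta(minutes=45), now))
print(edge_sync_is_stale(now - timedelta(minutes=10), now))
```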

Event severity when time of last Edge synchronization exceeds threshold

Set the severity level, from 1 to 40, to indicate the importance of an event in which the number of minutes since the last synchronization exceeds the threshold you set. The default is 15.

Raise event if unable to communicate with any Mailbox servers?

Select Yes to raise an event when the Hub Transport server cannot communicate with the Mailbox databases on the Mailbox server. The default is Yes.

The Hub Transport server transports e-mail to and from the Mailbox server. Therefore, ensuring uninterrupted communication is vital to the health of your Exchange Server 2007 environment.

Event severity when unable to communicate with any Mailbox servers

Set the severity level, from 1 to 40, to indicate the importance of an event in which the Hub Transport server cannot communicate with the Mailbox databases on the Mailbox server. The default is 15.

Message Queues

Raise event for number of messages in queue?

Select Yes to raise an event if the number of messages in queue exceeds the threshold you set. The default is Yes.

Threshold - Maximum number of messages in queue

Set the maximum number of messages that can be in queue before an event is raised. The default is 1000 messages.

Event severity when queue size exceeds threshold

Set the severity level, from 1 to 40, to indicate the importance of an event in which the number of messages in queue exceeds the threshold you set. The default is 15.

Raise event if increase in queue size is excessive?

Select Yes to raise an event if the percentage of increase in queue size since the last job iteration exceeds the threshold you set. The default is No.

Threshold - Percentage of increase in queue size since last job iteration

Set the maximum acceptable percentage of increase in queue size since the last job iteration. If the percentage of increase exceeds this value, an event is raised. The default is 50%.

Event severity when increase in queue size exceeds threshold

Set the severity level, from 1 to 40, to indicate the importance of an event in which the percentage of increase in queue size exceeds the threshold you set. The default is 15.

Send, Receive, and Foreign Connectors

Raise event if any send, receive, or foreign connectors are disabled?

Select Yes to raise an event if a send, receive, or foreign connector is disabled. The default is Yes.

The receive connectors receive e-mail from an Edge Transport server, a Mailbox server, or from the Internet when an Edge role is not set up in the Exchange environment.

The send connectors send e-mail to the mailbox of the intended recipient or to the Edge Transport server for delivery to another domain.

The foreign connectors move e-mail to a server within the organization that does not communicate using SMTP.

Event severity when any connectors are disabled

Set the severity level, from 1 to 40, to indicate the importance of an event in which a send, receive, or foreign connector is disabled. The default is 15.

Raise event if receive connectors cannot respond to SMTP requests?

Select Yes to raise an event if a receive connector is unable to respond to SMTP requests. The default is Yes.

Event severity when receive connectors cannot respond to SMTP requests

Set the severity level, from 1 to 40, to indicate the importance of an event in which a receive connector is unable to respond to SMTP requests. The default is 15.

Mailbox Server Role

Mail Flow

Target Mailbox server

Enter the hostname of the computer that hosts the Mailbox server with which you want to check connectivity. The hostname need not be fully qualified unless DNS lookup does not resolve the simple name.

This is an optional parameter.

IMPORTANT: Both the Mailbox server and the computer on which you run this Knowledge Script must be in the same Active Directory forest. To test e-mail flow to a Mailbox server that is not in the same Active Directory forest, use the Email address for recipient of test message parameter.

The connectivity test verifies that the local Mailbox server can send e-mail to the specified Mailbox server.

Email address for recipient of test message

Provide the e-mail address with which you want to check connectivity.

The connectivity test will verify that the local Mailbox server can send e-mail to the specified e-mail address.

IMPORTANT: Use this parameter to test e-mail flow to a Mailbox server that is not in the same Active Directory forest as the local Mailbox server. To test e-mail flow to a Mailbox server that is in the same Active Directory forest, use the Target Mailbox server parameter.

Raise event if mail flow test fails?

Select Yes to raise an event if AppManager cannot check connectivity to the e-mail address you specified. The default is Yes.

Event severity when mail flow test fails

Set the severity level, from 1 to 40, to indicate the importance of an event in which AppManager cannot check connectivity to the e-mail address you specified. The default is 15.

Raise event if response time is excessive?

Select Yes to raise an event if the amount of time it takes to connect to the e-mail address exceeds the threshold you set. The default is Yes.

Threshold - Response time for mail flow test

Set the number of milliseconds that AppManager should wait to confirm connectivity with the e-mail address before raising an event. The default is 1000 ms. The minimum is 1 ms.

Event severity when response time exceeds threshold

Set the severity level, from 1 to 40, to indicate the importance of an event in which the time taken for testing connectivity to the e-mail address exceeds the threshold that you set. The default is 25.

Mailbox Accessibility

Raise event if system mailbox cannot be accessed?

Select Yes to raise an event if the Mailbox server cannot access the system mailbox. The default is Yes.

The system mailbox is inaccessible when it does not exist.

Event severity when system mailbox cannot be accessed

Set the severity level, from 1 to 40, to indicate the importance of an event in which the Mailbox server cannot access the system mailbox. The default is 15.

Raise event if response time is excessive?

Select Yes to raise an event if the amount of time it takes to connect to the system mailbox exceeds the threshold you set. The default is Yes.

Threshold - Response time for mailbox accessibility test

Set the number of milliseconds that AppManager should wait to confirm connectivity with the system mailbox before raising an event. The default is 1000 ms. The minimum is 1 ms.

Event severity when response time exceeds threshold

Set the severity level, from 1 to 40, to indicate the importance of an event in which the time taken for testing connectivity to the system mailbox exceeds the threshold that you set. The default is 25.

Storage Group Replication

Raise event if replication is unhealthy?

Select Yes to raise an event if storage group replication is unhealthy. The default is Yes.

AppManager considers replication to be unhealthy if at least one of the following conditions exists:

  • The length of the copy queue for a storage group is greater than 3.

  • The length of the replay queue for a storage group is greater than 20.

  • The Exchange Server indicates that replication is unhealthy.
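The three conditions above combine with a logical OR. The Python sketch below expresses that rule. replication_is_unhealthy is a hypothetical helper, the queue limits are the fixed values listed above, and server_reports_unhealthy stands in for whatever status the Exchange Server reports.

```python
COPY_QUEUE_LIMIT = 3     # copy queue length above this is unhealthy
REPLAY_QUEUE_LIMIT = 20  # replay queue length above this is unhealthy


def replication_is_unhealthy(copy_queue_length, replay_queue_length,
                             server_reports_unhealthy):
    """Return True if any one of the three conditions holds."""
    return (
        copy_queue_length > COPY_QUEUE_LIMIT
        or replay_queue_length > REPLAY_QUEUE_LIMIT
        or server_reports_unhealthy
    )


print(replication_is_unhealthy(3, 20, False))  # at the limits, not over them
print(replication_is_unhealthy(4, 0, False))   # copy queue too long
```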

Event severity when replication is unhealthy

Set the severity level, from 1 to 40, to indicate the importance of an event in which storage group replication is unhealthy. The default is 15.

Database Status

Raise event if databases are unmounted?

Select Yes to raise an event if databases are unmounted. The default is Yes.

When a database is unmounted, the Exchange Server cannot store information in it or read information from it.

Event severity when databases are unmounted

Set the severity level, from 1 to 40, to indicate the importance of an event in which databases are unmounted. The default is 5.

Disk Space

Raise event if free space for database files is low?

Select Yes to raise an event if the amount of free space for database files falls below the threshold you set. The default is Yes.

Threshold - Minimum free disk space

Set the minimum amount of disk space that must be available to prevent an event from being raised. The default is 1024 MB.
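To get a feel for the free-space threshold, the Python sketch below reads the free space of a volume with the standard shutil.disk_usage function and compares it with a threshold in megabytes. The helper names are hypothetical, the path is an example, and treating 1 MB as 1024 * 1024 bytes is an assumption. AppManager's own measurement may differ.

```python
import shutil

BYTES_PER_MB = 1024 * 1024  # assumed definition of a megabyte


def is_below_threshold(free_bytes, threshold_mb=1024):
    """Return True if free_bytes is less than the threshold in MB."""
    return free_bytes < threshold_mb * BYTES_PER_MB


def free_space_is_low(path, threshold_mb=1024):
    """Check the volume that contains path against the threshold."""
    return is_below_threshold(shutil.disk_usage(path).free, threshold_mb)


# Example: check the volume that holds the current directory.
print(free_space_is_low("."))
```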

Event severity when free disk space falls below threshold

Set the event severity, from 1 to 40, to indicate the importance of an event in which the amount of free disk space falls below the threshold. The default is 25.

Raise event if free space for log files is low?

Select Yes to raise an event if the amount of free space for log files falls below the threshold you set. The default is Yes.

Threshold - Minimum free disk space

Set the minimum amount of disk space that must be available to prevent an event from being raised. The default is 1024 MB.

Event severity when free disk space falls below threshold

Set the event severity, from 1 to 40, to indicate the importance of an event in which the amount of free disk space falls below the threshold. The default is 25.