NetIQ Documentation: NetIQ Cloud Manager 2.1.5 Orchestration Console Reference - Understanding the Orchestration Server Events System

B.3 Understanding the Orchestration Server Events System

The Orchestration Server Event System integrates with the Job Scheduler. Event notifications can start jobs and can also invoke Event handler methods in long-running jobs. In turn, a job can react to the Event by starting other server actions, by modifying object attributes, or by executing an external process.

For example, an Event notification can occur when a VM Host has exceeded its configured load limits. This Event can start a job that migrates VMs off of the loaded VM Host or VM Hosts.

The Orchestration Server supports two Event types:

Built-in Events, such as a change in the health status of a resource or a change in the online status of the resource.
Rule-based Events that are triggered when the attributes of an object satisfy the rules (constraint conditions) defining the Event.

This section includes the following information:

Section B.3.1, Event Notification
Section B.3.2, Built-in Events
Section B.3.3, Rule-based Events

B.3.1 Event Notification

An Event notifies two other Cloud Manager Orchestration services, the Job Scheduler and the Job Broker. The Job Scheduler starts jobs that are awaiting an Event to trigger them. The Job Broker invokes a callback on any long-running job that has registered for notification of an Event.
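The fan-out described above can be pictured with a small model. The following is only a conceptual sketch in Python, not the Orchestration Server API: the EventDispatcher class, its method names, and the resource name host1 are hypothetical, while RESOURCE_ONLINE and osInfo are names used elsewhere in this section.

```python
from collections import defaultdict


class EventDispatcher:
    """Toy model of Event fan-out; hypothetical, not the product's API."""

    def __init__(self):
        self.scheduled_jobs = defaultdict(list)  # Event name -> jobs waiting on a trigger
        self.job_callbacks = defaultdict(list)   # Event name -> handlers of long-running jobs
        self.started = []                        # jobs the "scheduler" has started

    def add_trigger(self, event_name, job_name):
        self.scheduled_jobs[event_name].append(job_name)

    def register_handler(self, event_name, callback):
        self.job_callbacks[event_name].append(callback)

    def notify(self, event_name, context):
        # Job Scheduler role: start every job whose trigger names this Event.
        for job_name in self.scheduled_jobs[event_name]:
            self.started.append((job_name, event_name))
        # Job Broker role: call back each long-running job that registered.
        for callback in self.job_callbacks[event_name]:
            callback(context)


dispatcher = EventDispatcher()
dispatcher.add_trigger("RESOURCE_ONLINE", "osInfo")

seen = []  # stands in for the state of a long-running job
dispatcher.register_handler("RESOURCE_ONLINE", seen.append)

dispatcher.notify("RESOURCE_ONLINE", {"resource": "host1"})
```

After the call to notify, the started list records that the osInfo job was started, and the registered handler has received the Event context once.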

See Section 13.0, The Orchestration Server Job Scheduler, for more information about setting up a job schedule.

B.3.2 Built-in Events

Built-in Events occur when a managed object comes online or goes offline, or when its health status changes.

The Orchestration Server uses the following built-in Events to keep managed objects synchronized.

Table B-1 The Orchestration Server Built-in Events

Event Name	Description
AGENT_VERSION_MISMATCH	Resource Agent version mismatch (agent needs upgrade)
REPOSITORY_HEALTH	Repository health status has changed
RESOURCE_HEALTH	Resource health status has changed
RESOURCE_OFFLINE	Resource Agent has logged out of the server
RESOURCE_ONLINE	Resource Agent has logged in to the server
SERVER_UP	Server has fully started
USER_HEALTH	User health status has changed
USER_ONLINE	User has logged in to the server
VMHOST_ADDED	VM Host has been added
VMHOST_HEALTH	VM Host health status has changed
VMHOST_NOT_AVAILABLE	No VM Host is available
VMHOST_REMOVED	VM Host has been removed

For example, when a resource comes online (that is, the agent connects to the server), the RESOURCE_ONLINE Event is fired and both scheduled jobs with a trigger for that Event and long-running jobs with Event handlers are notified.

The embedded discovery jobs use the RESOURCE_ONLINE built-in Event, for example to discover operating system and CPU information (the osInfo and cpuInfo jobs). Both the osInfo and cpuInfo job archives (.job) include a schedule file (.sched) specifying a trigger (.trig) that allows these jobs to be started when notification of the RESOURCE_ONLINE Event occurs.

B.3.3 Rule-based Events

Rule-based Events are defined in an XML document. They are deployed to the Orchestration Server and managed through the Orchestration Console. Rules can be a simple object attribute (fact) equivalency check, or they can use AND, OR, IF, and ELSE logic, among other things, in an Event ruleset.

The rules follow the same syntax as the constraints that are defined in XML policy files for all Grid Objects, such as Jobs and VM Hosts.

The Orchestration Server Event Service evaluates the rules; if the rules pass, an Event notification occurs.

The XML Schema document specification can be found in <install dir>/doc/xsds/event_1_0_0.xsd.

The Event XML specification is composed of three sections.

<context>
<trigger>
<reset>

NOTE: Both the <context> and <trigger> sections are required.

<context> section

The <context> section defines the context in which the Event rules are evaluated by naming the objects that belong to the Event rule context. The available objects are Job, Jobinstance, Resource, Repository, User, and VMHost. From these objects, you can specify one object set to iterate over and, optionally, a single instance of the object.

<trigger> section

The <trigger> section defines the rules for when an Event notification occurs. The <trigger> format is the same syntax as <constraints> used in policies.

<reset> section

The optional <reset> section defines the rules for when an Event is reset. If the <reset> rule is not used, an Event is reset based on a timeout. The <reset> section also uses the same syntax as the <constraints> used in policies.

The resetInterval attribute is set on the <event> XML element. If neither the resetInterval attribute nor a <reset> rule is used, the default timeout for resetting is 10 minutes.
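The reset decision can be sketched as a small Python function. This is only an illustration of the rules stated above, not the server's implementation. The function name and parameters are hypothetical, the use of seconds as the unit is an assumption (this section does not state the unit of resetInterval), and letting a <reset> rule take precedence when both are supplied is also an assumption.

```python
DEFAULT_RESET_SECONDS = 10 * 60  # the 10-minute default stated in this section


def can_fire_again(last_fired, now, reset_rule_passed=None, reset_interval=None):
    """Return True if a previously fired Event may fire again.

    last_fired, now   -- timestamps in seconds (unit is an assumption)
    reset_rule_passed -- None when no <reset> rule exists, else the rule's result
    reset_interval    -- None models an unset resetInterval attribute
    """
    # Assumption: when a <reset> rule exists, it alone decides the reset.
    if reset_rule_passed is not None:
        return reset_rule_passed
    # Otherwise the reset is based on a timeout, defaulting to 10 minutes.
    interval = DEFAULT_RESET_SECONDS if reset_interval is None else reset_interval
    return now - last_fired >= interval
```

For example, with no <reset> rule and no resetInterval, can_fire_again(0, 599) is False and can_fire_again(0, 600) is True.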

The following example, taken from vmhost.event in <install dir>/examples/events, defines an Event that sends a notification when a VM host becomes overloaded.

 1  <event>
 2
 3     <context>
 4        <vmhost />
 5     </context>
 6
 7     <trigger>
 8        <gt fact="vmhost.vm.count" value="0" />
 9        <gt fact="vmhost.resource.loadaverage" value="2" />
10     </trigger>
11
12     <reset>
13        <lt fact="vmhost.resource.loadaverage" value=".5" />
14     </reset>
15
16  </event>

Lines 3-5: This section defines the context for the Event’s rule evaluation.

Line 4: The context specifies all VM host objects, so the Event Service iterates over all VM hosts. On each VM host, the <trigger> rule will be evaluated, so in this case, the Event context is composed of one or more VM hosts.

Lines 7-10: This section defines the trigger rule that determines whether this Event fires notifications. If the trigger rule does not pass, no Event notifications occur.

Line 8: Consider only VM hosts that have at least one VM instance running.

Line 9: Check whether the VM host’s load average exceeds a threshold value. In this case, the rule passes if the load average is greater than 2.

Lines 12-14: This section defines the Reset rule to determine if a previously triggered VM host can be reset and triggered again.

Line 13: Only reset if the running average of the VM host’s load average drops below a threshold.

After a VM host passes the trigger rule, it cannot pass the trigger rule again until the reset rule passes (that is, until its load average drops below the reset threshold).
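The trigger-then-reset behavior of this example can be simulated in a few lines of Python. This is a sketch for illustration only, not how the Event Service is implemented. The EventLatch class, the rules_pass and facts helpers, and the host name hostA are hypothetical. Treating multiple rules in a section as an implicit AND is an assumption, and only the gt and lt comparisons that appear in the example are handled.

```python
import xml.etree.ElementTree as ET

# The vmhost.event example from this section, without line numbers.
EVENT_XML = """
<event>
  <context><vmhost /></context>
  <trigger>
    <gt fact="vmhost.vm.count" value="0" />
    <gt fact="vmhost.resource.loadaverage" value="2" />
  </trigger>
  <reset>
    <lt fact="vmhost.resource.loadaverage" value=".5" />
  </reset>
</event>
"""

COMPARE = {"gt": lambda a, b: a > b, "lt": lambda a, b: a < b}


def rules_pass(section, facts):
    """True if every rule in the section passes (implicit AND is assumed)."""
    return all(
        COMPARE[rule.tag](float(facts[rule.get("fact")]), float(rule.get("value")))
        for rule in section
    )


class EventLatch:
    """Remembers which hosts have triggered and are not yet reset."""

    def __init__(self, xml_text):
        root = ET.fromstring(xml_text)
        self.trigger = root.find("trigger")
        self.reset = root.find("reset")
        self.fired = set()

    def evaluate(self, hosts):
        """hosts maps a host name to its facts; returns hosts that notify now."""
        notified = []
        for name, host_facts in hosts.items():
            if name in self.fired:
                # A triggered host only becomes eligible again after the reset rule passes.
                if rules_pass(self.reset, host_facts):
                    self.fired.discard(name)
                continue
            if rules_pass(self.trigger, host_facts):
                self.fired.add(name)
                notified.append(name)
        return notified


def facts(vm_count, load):
    return {"vmhost.vm.count": vm_count, "vmhost.resource.loadaverage": load}


latch = EventLatch(EVENT_XML)
first = latch.evaluate({"hostA": facts(3, 2.5)})   # trigger passes: ["hostA"]
second = latch.evaluate({"hostA": facts(3, 3.0)})  # still latched: []
third = latch.evaluate({"hostA": facts(3, 0.4)})   # reset passes, no notification: []
fourth = latch.evaluate({"hostA": facts(3, 2.5)})  # eligible again: ["hostA"]
```

The second evaluation produces no notification even though the load is still above 2, because hostA has not yet satisfied the reset rule.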

See the repository.event example (<install dir>/examples/events/repository.event) for an Event with a rule that evaluates the freespace fact on all repository objects.