Workflow automation with NetIQ Operations Center and NetIQ Aegis

By: Martin Cotter

January 29, 2014 12:17 pm

Aegis doesn’t need application-specific adapters to integrate with applications. No programming or scripting is required, although you will need some experience with Aegis!

Please note this procedure is based on NetIQ Operations Center 5.0 with Patch Bundle 7. It may work exactly as is for other versions of Operations Center, although changes on the Operations Center side may require some small adjustments.

NetIQ Operations Center is a great central resource for event and business service information in your enterprise, so hooking this data into Aegis is a really powerful combination.

Operations Center provides a SOAP webservice – so Aegis can use this as an integration point using the SOAP Webservice activity out of the box.

OK, so let’s provide a simple use case and see how the integration would work. The use case is to turn alarms in Operations Center into events in Aegis. This allows Aegis to correlate and trigger workflows based on individual events or combinations of events from Operations Center and other systems, or to append, block or wait for events in a running workitem. Operations Center will need to be configured with a Warehouse for storing historical alarms, as this is the data we request in this example.

The technique used here can serve as a template for any simple event polling workflow; only the integration point with the target system would need to change. This will be a simple scheduled workflow which polls for alarm data over a specific time range and then converts those alarms into Aegis events.

So first of all I will show a possible solution using an Aegis workflow, so you have a general idea of what is to come and something to refer back to when doing the technical bit! Please note this is a very simplistic implementation: it assumes an ideal world and only has one error handler, for a condition that will definitely happen. In the real world this should be made far more robust and scalable, but that is outside the scope of this demo!

noc1

Ok so the basic plan is this:

  1. Trigger Workflow based on a Schedule, for example every 1 minute.
  2. Logon to NetIQ Operations Center.
  3. Get Server Info
  4. Calculate date range to poll for NOC alarms
  5. Request all alarms in range.
  6. Logout of webservice.
  7. Parse the results into individual alarms.
  8. Generate event in Aegis for each individual alarm.
  9. Save the LastPolltime for next poll period.

That’s what we will do in the following description. It is always important to have a flowchart of the workflow you are going to build, to ensure all logic and features are implemented correctly and as efficiently as possible.

So first things first – we will integrate using the Operations Center Soap WebService. Aegis has a Soap Webservice activity which will discover and expose the required method parameters in the workflow designer, so out of the box we have our method to communicate with the webservice.

For details on the webservice itself, check out the Operations Center documentation:

https://www.netiq.com/documentation/noc50/web_services/data/bookinfo.html

The details may vary depending on the version! This demo only touches on the features available in the SOAP webservice for Operations Center!

For the examples that follow, sigea-sles is my NetIQ Operations Center server, and 8084 is my webservice port.

OK, let’s follow the same numbering system again so it links back to the overview:

  1. Trigger Based on a Schedule, for example every 1 minute

    Nothing NOC specific here. All we do is create an event Schedule for your desired poll period, and configure your workflow to trigger on that event schedule.

  2. Logon to NetIQ Operations Center.

    We will use the ‘Call a Webservice’ activity for all the activities which interact with the NOC Soap Webservice. Point the activity to this URL to discover the webservice methods:

    http://sigea-sles:8084/wsapi/services/Moswsapi_1_0

    noc2

    If you select the login method, you will see you need to specify your NOC login credentials as username, passwordhash and hashType. hashType is an enumerated value with a value of MD5, and passwordhash is a Base64-encoded value, so first you need to hash your password with the MD5 algorithm and Base64-encode the result in order to log in. Luckily there is an activity here which does just that in one simple step!

    https://www.netiq.com/communities/cool-solutions/cool_tools/aegis-depot-activity-hashgenerator/

    The activity has two inputs: the algorithm to use and whatever string value you want to hash. At runtime it looks like this. My input value is my password (in this case it is stored in an Aegis Password attribute, so it is encrypted and can’t be seen). The activity also has two outputs, base64 and HEX; we want the base64 version.

    noc3
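
For readers who want to check the value outside Aegis, here is a minimal Python sketch of the same calculation. It is not the HashGenerator activity itself. It assumes the password is encoded as UTF-8 and that the base64 output is the Base64 encoding of the raw MD5 digest bytes; the function names are made up for this illustration.

```python
import base64
import hashlib


def md5_base64(password: str) -> str:
    """Return the Base64 encoding of the raw MD5 digest of `password`."""
    # Assumption: the password is encoded as UTF-8 before hashing.
    digest = hashlib.md5(password.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def md5_hex(password: str) -> str:
    """Return the same MD5 digest as a lowercase hex string."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # "password" is a placeholder, not a real credential.
    print(md5_base64("password"))
    print(md5_hex("password"))
```

If the value produced this way does not match what the activity outputs for the same password, the encoding assumption above is the first thing to check.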

    The configuration of the Login activity is like this:

    noc4

    And at runtime … you’ll see that we get a soap response containing the session key and expiration time of the key. We really only need the session key as in this example we will be logging out after each poll (keeping it simple), but you could keep track of the key expiration time and re-login when this session is expired.

    noc5

    We can parse the output using the Regular Expression activity (probably the handiest activity there is!), which makes each of the three response values available in an array. The regular expression is: .*<expiryTime>(.*)</expiryTime>.*<key>(.*)</key>.*<lastRefreshTime>(.*)</lastRefreshTime>.*

    At runtime:

    noc6
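
To try the expression outside Aegis, the sketch below applies it with Python's re module. The sample text reuses the session block from the sample response at the end of this article, on the assumption that the login response carries a session block of the same shape. The DOTALL flag is also an assumption, added so that ".*" can span line breaks; the Aegis activity may treat line breaks differently.

```python
import re

# The expression from the article. DOTALL (an assumption here) lets ".*"
# match across the line breaks in the SOAP response.
SESSION_RE = re.compile(
    r".*<expiryTime>(.*)</expiryTime>.*<key>(.*)</key>"
    r".*<lastRefreshTime>(.*)</lastRefreshTime>.*",
    re.DOTALL,
)


def parse_session(soap_response):
    """Return (expiryTime, key, lastRefreshTime), or None if nothing matches."""
    match = SESSION_RE.match(soap_response)
    return match.groups() if match else None


# Session block copied from the sample response at the end of the article.
SAMPLE = """<session>
<expiryTime>2014-01-07T16:35:05.923Z</expiryTime>
<key>W2NvdHRlcm06NF0xMzg5MTEyMjAxNjI4</key>
<lastRefreshTime>2014-01-07T16:30:05.923Z</lastRefreshTime>
</session>"""

if __name__ == "__main__":
    expiry_time, key, last_refresh_time = parse_session(SAMPLE)
    print(key)
```

The second captured value is the session key that the later webservice calls need.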

    So there we are logged in, and we have the session key required to perform other tasks on the NOC server while the session is active.

  3. Get Server Info

    This step isn’t really required, but it is a handy ‘heartbeat’ method. All we do is run the GetServerInfo method, which returns information such as server status messages, the session timeout duration and the server start time. It also extends the session by the timeout duration, which is really handy if you want the workitem to run longer than the session timeout. Again the response is a SOAP response, so you’ll need to parse it for the values you want.

  4. Calculate date range to poll for NOC alarms

    OK, this step involves a small bit of logic. Each time we poll for alarms we want to poll a new time range, so we poll between StartTime and EndTime. On the next poll, StartTime will equal the previous EndTime + 1 second, so we need to store EndTime somewhere accessible, for example in a Global Property, a file or a database. The EndTime value can come from a source like the start time of the workitem. We can also add a delay of X seconds so that we are well past the EndTime of the range when we poll. This is pure Aegis again and not specific to NOC.
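
The range arithmetic can be sketched in a few lines of Python. This is only an illustration of the logic described above, not how Aegis does it internally; the function name is hypothetical, the stored EndTime is passed in as a plain argument, and the optional delay is left out.

```python
from datetime import datetime, timedelta, timezone


def next_poll_range(last_end_time, workitem_start):
    """Return (StartTime, EndTime) for this poll, or None if the range is empty.

    last_end_time  -- EndTime saved by the previous poll
    workitem_start -- start time of the current workitem, used as the new EndTime
    """
    start_time = last_end_time + timedelta(seconds=1)
    end_time = workitem_start
    if end_time < start_time:
        # Nothing new to poll yet (for example, two runs in the same second).
        return None
    return start_time, end_time


if __name__ == "__main__":
    previous_end = datetime(2014, 1, 1, 0, 0, 0, tzinfo=timezone.utc)
    this_run = datetime(2014, 1, 1, 0, 1, 0, tzinfo=timezone.utc)
    start, end = next_poll_range(previous_end, this_run)
    print(start.isoformat(), end.isoformat())
```

The returned EndTime is the value to save at the end of the workflow so that the next poll can start from it.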

  5. Request all alarms in range.

    This time we use the webservice activity again, and change to use the getHistoricalAlarms method.

    Now we get to reuse the session key we got during login in the key value of the session input.

    ElementDName is the Element in NOC where you want to poll for alarms. To get all alarms, choose: root=Elements

    Or, to start collecting alarms from a specific child element, use something like: netiq=sigea-multi2+-+AM+Adapter/root=Elements

    There is a caveat about the level at which you start the search: if you search from the top level, you get alarms from different types of adapters, so the alarm formats will differ when it comes to parsing them. For simplicity I will only pull alarms from one adapter, the NetIQ AppManager Adapter.

    You can get this format by right-clicking the element in the NOC console and choosing Properties.

    noc7

    For channelName use HISTORY. Available Channel names are returned by the GetServerInfo Method.

    For From and To, use the date range calculated in step 4.

    MaxRecords is a way to throttle the number of results. If there are more results than the max, a pointer to the next set of results is returned so these can be retrieved, but this is outside the scope of this demo.

    The configured Activity should look something like this:

    noc8

  6. Logout of webservice
  7. Parse the results into individual alarms.

    getHistoricalAlarms returns a SOAP response containing all the alarms wrapped up in XML, so this data first needs to be broken down into individual alarms. This can be done using the XPath activity, which will parse the XML and output the individual matches in an array, which in this case will be an array of alarms.

    The XPath expression to get the data we want is this:

    //*[local-name()='alarms']/*[local-name()='item']

    This will parse out the individual <alarms><item> elements which contain the alarms

    noc9
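
The same split can be tried in Python with the standard library. Note that xml.etree.ElementTree does not support the local-name() function, so this sketch swaps in namespace-qualified tag names instead of the XPath expression above; the namespace is the one declared in the sample response at the end of this article. The sample here is a heavily shortened stand-in for that response, and the function name is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the sample getHistoricalAlarms response.
NS = "http://wsapi.mosol.com"

# Direct <item> children of any <alarms> element; the <item> elements nested
# inside <fields> are not selected.
ALARM_PATH = ".//{%s}alarms/{%s}item" % (NS, NS)


def split_alarms(soap_xml):
    """Return the list of alarm <item> elements found in the response."""
    root = ET.fromstring(soap_xml)
    return root.findall(ALARM_PATH)


# Shortened stand-in for the real response (most fields removed).
SAMPLE = """<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Body>
<getHistoricalAlarmsResponse xmlns="http://wsapi.mosol.com">
<getHistoricalAlarmsReturn>
<alarms>
<item><fields><item><name>EventID</name><value>22122</value></item></fields><id>22122</id></item>
<item><fields><item><name>EventID</name><value>22123</value></item></fields><id>22123</id></item>
</alarms>
</getHistoricalAlarmsReturn>
</getHistoricalAlarmsResponse>
</soapenv:Body>
</soapenv:Envelope>"""

if __name__ == "__main__":
    for alarm in split_alarms(SAMPLE):
        print(alarm.findtext("{%s}id" % NS))
```

The function returns elements rather than XML strings, because serialising them again with ElementTree would add namespace prefixes to the tags, and a plain-text regular expression written against the original tag names would then no longer match.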

  8. Generate event in Aegis for each individual alarm.

    Now we loop through the array of alarms and generate the events! Each individual alarm is still wrapped in XML though, so while we loop we also parse the elements. This time I will use the Regular Expression Activity to parse the XML into the parts of the alarm I want to keep.

    This is a long Regex, but it is relatively simple when you break it down. It extracts the following fields from the Alarm – remember this is for a specific adapter so it would need to be modified to fit.

    elementDName, EventID, ParentEventID, JobID, SeverityCode, Status, FirstOccurTime, LastOccurTime, KPName, EventMsg, MachineName

    Regex:

    .*<elementDName>(.*)</elementDName>.*<name>EventID</name>.*?<value>(.*?)</value>.*<name>ParentEventID</name>.*?<value>(.*?)</value>.*<name>JobID</name>.*?<value>(.*?)</value>.*<name>SeverityCode</name>.*?<value>(.*?)</value>.*<name>Status</name>.*?<value>(.*?)</value>.*<name>FirstOccurTime</name>.*?<value>(.*?)</value>.*<name>LastOccurTime</name>.*?<value>(.*?)</value>.*<name>KPName</name>.*?<value>(.*?)</value>.*<name>EventMsg</name>.*?<value>(.*?)</value>.*<name>MachineName</name>.*?<value>(.*?)</value>

    Which outputs the alarm attributes I require:

    noc10
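
To see how the expression is put together, the sketch below rebuilds the same pattern shape in Python from a list of field names and applies it to a cut-down alarm. Only three of the fields are used to keep the example short; the sample alarm, the shortened elementDName and the function names are made up for this illustration, and the DOTALL flag is an assumption so that ".*" can span line breaks. The field names must be listed in the order in which they appear in the alarm, just as in the full expression.

```python
import re

# A subset of the fields from the article, in the order they appear in an alarm.
FIELDS = ["EventID", "Status", "MachineName"]


def build_alarm_regex(field_names):
    """Build a pattern with the same shape as the article's expression."""
    pattern = r".*<elementDName>(.*)</elementDName>"
    for name in field_names:
        pattern += r".*<name>%s</name>.*?<value>(.*?)</value>" % re.escape(name)
    return re.compile(pattern, re.DOTALL)


def parse_alarm(alarm_xml, field_names):
    """Return a dict of elementDName plus the requested fields, or None."""
    match = build_alarm_regex(field_names).match(alarm_xml)
    if match is None:
        return None
    return dict(zip(["elementDName"] + list(field_names), match.groups()))


# Cut-down alarm modelled on the sample response (hypothetical short DName).
SAMPLE_ALARM = """<item>
<elementDName>netiq=sigea-multi2+-+AM+Adapter/root=Elements</elementDName>
<fields>
<item><displayName>EventID</displayName><name>EventID</name><value>22122</value></item>
<item><displayName>Status</displayName><name>Status</name><value>OPEN</value></item>
<item><displayName>Machine Name</displayName><name>MachineName</name><value>SIGEA-MULTI2</value></item>
</fields>
</item>"""

if __name__ == "__main__":
    print(parse_alarm(SAMPLE_ALARM, FIELDS))
```

Building the pattern from a list makes it easier to adjust when a different adapter returns a different set of fields.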

    Now you can use these fields to create events via the ‘Create Aegis Event’ activity, so you can define NOC event definitions, correlate and trigger workitems etc.

    noc11

  9. Save the LastPolltime for next poll period

    Once you have looped through all the NOC alarms, all that is left is to save the LastPolltime so that it can be used by the next polling run, storing it wherever step 4 reads it from.

And that’s it!

For reference, below is a sample output from the getHistoricalAlarms method call containing two alarms:

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Body>
<getHistoricalAlarmsResponse xmlns="http://wsapi.mosol.com">
<getHistoricalAlarmsReturn>
<alarms>
<item>
<alarmClassName xsi:nil="true" />
<elementDName>NT_CPUNumber=0/NT_CPUFolder=CPU/NT_MachineFolder=SIGEA-MULTI2/NT_GenericFolder=Dummy/NT_GenericFolder=Master/netiq=sigea-multi2+-+AM+Adapter/root=Elements</elementDName>
<elementTempId>1389026208</elementTempId>
<fields>
<item>
<displayName>EventID</displayName>
<name>EventID</name>
<value>22122</value>
</item>
<item>
<displayName>Parent ID</displayName>
<name>ParentEventID</name>
<value>17587</value>
</item>
<item>
<displayName>Job ID</displayName>
<name>JobID</name>
<value>193</value>
</item>
<item>
<displayName>DepJobID</displayName>
<name>DepJobID</name>
<value>0</value>
</item>
<item>
<displayName>RepositoryName</displayName>
<name>RepositoryName</name>
<value>
</value>
</item>
<item>
<displayName>SeverityCode</displayName>
<name>SeverityCode</name>
<value>Severe</value>
</item>
<item>
<displayName>SeverityNum</displayName>
<name>SeverityNum</name>
<value>10</value>
</item>
<item>
<displayName>Status</displayName>
<name>Status</name>
<value>OPEN</value>
</item>
<item>
<displayName>FirstOccurTime</displayName>
<name>FirstOccurTime</name>
<value>Wed Jan 01 00:02:20 GMT 2014</value>
</item>
<item>
<displayName>Last Occurrence</displayName>
<name>LastOccurTime</name>
<value>1388534540,0</value>
</item>
<item>
<displayName>ObjID</displayName>
<name>ObjID</name>
<value>38</value>
</item>
<item>
<displayName>KP Name</displayName>
<name>KPName</name>
<value>NT_CpuLoaded</value>
</item>
<item>
<displayName>TypeObjName</displayName>
<name>TypeObjName</name>
<value>NT_CPUNumber:0</value>
</item>
<item>
<displayName>Severity</displayName>
<name>Severity</name>
<value>10</value>
</item>
<item>
<displayName>Message</displayName>
<name>EventMsg</name>
<value>Windows Performance: Total virtual processor utilization is high</value>
</item>
<item>
<displayName>Occurrence</displayName>
<name>Occurrence</name>
<value>1</value>
</item>
<item>
<displayName>ModificationTime</displayName>
<name>ModificationTime</name>
<value xsi:nil="true" />
</item>
<item>
<displayName>UserAcknowledged</displayName>
<name>UserAcknowledged</name>
<value>
</value>
</item>
<item>
<displayName>Machine Name</displayName>
<name>MachineName</name>
<value>SIGEA-MULTI2</value>
</item>
<item>
<displayName>KSGName</displayName>
<name>KSGName</name>
<value>
</value>
</item>
<item>
<displayName>KSName</displayName>
<name>KSName</name>
<value>
</value>
</item>
<item>
<displayName>Category</displayName>
<name>Category</name>
<value>
</value>
</item>
<item>
<displayName>TimeActiveBias</displayName>
<name>TimeActiveBias</name>
<value>0</value>
</item>
<item>
<displayName>LastOccurConsoleTime</displayName>
<name>LastOccurConsoleTime</name>
<value>Wed Jan 01 00:02:20 GMT 2014</value>
</item>
<item>
<displayName>FirstOccurConsoleTime</displayName>
<name>FirstOccurConsoleTime</name>
<value>Wed Jan 01 00:02:20 GMT 2014</value>
</item>
<item>
<displayName>HasComment</displayName>
<name>HasComment</name>
<value>0</value>
</item>
<item>
<displayName>WasNullAttrs</displayName>
<name>WasNullAttrs</name>
<value>
</value>
</item>
<item>
<displayName>FirstOccurTimeString</displayName>
<name>FirstOccurTimeString</name>
<value>2014-01-01 00:02:20 GMT+00:00</value>
</item>
<item>
<displayName>LastOccurTimeString</displayName>
<name>LastOccurTimeString</name>
<value>2014-01-01 00:02:20 GMT+00:00</value>
</item>
<item>
<displayName>FCOName</displayName>
<name>FCOName</name>
<value>
</value>
</item>
</fields>
<iconDescriptors>
<item>
<extension>.gif</extension>
<name>NT_CPUNumber</name>
<type>SMALL_IMAGE</type>
</item>
<item>
<extension>.gif</extension>
<name>NT_CPUNumber</name>
<type>LARGE_IMAGE</type>
</item>
</iconDescriptors>
<id>22122</id>
<lastUpdate>2014-01-01T00:02:20.000Z</lastUpdate>
<persistentId>22122</persistentId>
<severity>CRITICAL</severity>
</item>
<item>
<alarmClassName xsi:nil="true" />
<elementDName>NT_CPUNumber=0/NT_CPUFolder=CPU/NT_MachineFolder=SIGEA-MULTI2/NT_GenericFolder=Dummy/NT_GenericFolder=Master/netiq=sigea-multi2+-+AM+Adapter/root=Elements</elementDName>
<elementTempId>1389026208</elementTempId>
<fields>
<item>
<displayName>EventID</displayName>
<name>EventID</name>
<value>22123</value>
</item>
<item>
<displayName>Parent ID</displayName>
<name>ParentEventID</name>
<value>17580</value>
</item>
<item>
<displayName>Job ID</displayName>
<name>JobID</name>
<value>191</value>
</item>
<item>
<displayName>DepJobID</displayName>
<name>DepJobID</name>
<value>0</value>
</item>
<item>
<displayName>RepositoryName</displayName>
<name>RepositoryName</name>
<value>
</value>
</item>
<item>
<displayName>SeverityCode</displayName>
<name>SeverityCode</name>
<value>Severe</value>
</item>
<item>
<displayName>SeverityNum</displayName>
<name>SeverityNum</name>
<value>10</value>
</item>
<item>
<displayName>Status</displayName>
<name>Status</name>
<value>OPEN</value>
</item>
<item>
<displayName>FirstOccurTime</displayName>
<name>FirstOccurTime</name>
<value>Wed Jan 01 00:04:18 GMT 2014</value>
</item>
<item>
<displayName>Last Occurrence</displayName>
<name>LastOccurTime</name>
<value>1388534658,0</value>
</item>
<item>
<displayName>ObjID</displayName>
<name>ObjID</name>
<value>38</value>
</item>
<item>
<displayName>KP Name</displayName>
<name>KPName</name>
<value>NT_CpuLoaded</value>
</item>
<item>
<displayName>TypeObjName</displayName>
<name>TypeObjName</name>
<value>NT_CPUNumber:0</value>
</item>
<item>
<displayName>Severity</displayName>
<name>Severity</name>
<value>10</value>
</item>
<item>
<displayName>Message</displayName>
<name>EventMsg</name>
<value>Windows Performance: Total virtual processor utilization is high</value>
</item>
<item>
<displayName>Occurrence</displayName>
<name>Occurrence</name>
<value>1</value>
</item>
<item>
<displayName>ModificationTime</displayName>
<name>ModificationTime</name>
<value xsi:nil="true" />
</item>
<item>
<displayName>UserAcknowledged</displayName>
<name>UserAcknowledged</name>
<value>
</value>
</item>
<item>
<displayName>Machine Name</displayName>
<name>MachineName</name>
<value>SIGEA-MULTI2</value>
</item>
<item>
<displayName>KSGName</displayName>
<name>KSGName</name>
<value>
</value>
</item>
<item>
<displayName>KSName</displayName>
<name>KSName</name>
<value>
</value>
</item>
<item>
<displayName>Category</displayName>
<name>Category</name>
<value>
</value>
</item>
<item>
<displayName>TimeActiveBias</displayName>
<name>TimeActiveBias</name>
<value>0</value>
</item>
<item>
<displayName>LastOccurConsoleTime</displayName>
<name>LastOccurConsoleTime</name>
<value>Wed Jan 01 00:04:18 GMT 2014</value>
</item>
<item>
<displayName>FirstOccurConsoleTime</displayName>
<name>FirstOccurConsoleTime</name>
<value>Wed Jan 01 00:04:18 GMT 2014</value>
</item>
<item>
<displayName>HasComment</displayName>
<name>HasComment</name>
<value>0</value>
</item>
<item>
<displayName>WasNullAttrs</displayName>
<name>WasNullAttrs</name>
<value>
</value>
</item>
<item>
<displayName>FirstOccurTimeString</displayName>
<name>FirstOccurTimeString</name>
<value>2014-01-01 00:04:18 GMT+00:00</value>
</item>
<item>
<displayName>LastOccurTimeString</displayName>
<name>LastOccurTimeString</name>
<value>2014-01-01 00:04:18 GMT+00:00</value>
</item>
<item>
<displayName>FCOName</displayName>
<name>FCOName</name>
<value>
</value>
</item>
</fields>
<iconDescriptors>
<item>
<extension>.gif</extension>
<name>NT_CPUNumber</name>
<type>SMALL_IMAGE</type>
</item>
<item>
<extension>.gif</extension>
<name>NT_CPUNumber</name>
<type>LARGE_IMAGE</type>
</item>
</iconDescriptors>
<id>22123</id>
<lastUpdate>2014-01-01T00:04:18.000Z</lastUpdate>
<persistentId>22123</persistentId>
<severity>CRITICAL</severity>
</item>
</alarms>
<cursor>1389112205923</cursor>
<remainingAlarmssCount>179</remainingAlarmssCount>
</getHistoricalAlarmsReturn>
<session>
<expiryTime>2014-01-07T16:35:05.923Z</expiryTime>
<key>W2NvdHRlcm06NF0xMzg5MTEyMjAxNjI4</key>
<lastRefreshTime>2014-01-07T16:30:05.923Z</lastRefreshTime>
</session>
</getHistoricalAlarmsResponse>
</soapenv:Body>
</soapenv:Envelope>

 

Categories: Aegis, Operations Center, Technical Solutions

Disclaimer: As with everything else at NetIQ Cool Solutions, this content is definitely not supported by NetIQ, so Customer Support will not be able to help you if it has any adverse effect on your environment.  It just worked for at least one person, and perhaps it will be useful for you too.  Be sure to test in a non-production environment.
