- Data Collection Service Driver Walkthrough – Part 1
- Data Collection Service Driver Walkthrough – Part 2
- Data Collection Service Driver Walkthrough – Part 3
- Data Collection Service Driver Walkthrough – Part 4
- Data Collection Service Driver Walkthrough – Part 5
- Data Collection Service Driver Walkthrough – Part 6
- Data Collection Service Driver Walkthrough – Part 7
- Data Collection Service Driver Walkthrough – Part 8
With Novell Identity Manager 4.0 there are a number of new features available. You can read more about those features in these articles:
- What has changed between Identity Manager versions – Part 1
- What has changed between Identity Manager versions – Part 2
- What has changed between Identity Manager versions – Part 3
- More on what’s new in IDM 4
There are four new drivers: two are for newly supported connected systems (Salesforce.com and SharePoint), and two are service drivers that the Reporting module needs.
These two drivers are the Managed System Gateway (MSG) driver, and Data Collection Service (DCS) driver, the first of which you can read about in this series of articles:
- Trying to understand the Managed System Gateway driver in IDM 4 – Part 1
- Trying to understand the Managed System Gateway driver in IDM 4 – Part 2
- Trying to understand the Managed System Gateway driver in IDM 4 – Part 3
- Trying to understand the Managed System Gateway driver in IDM 4 – Part 4
- Trying to understand the Managed System Gateway driver in IDM 4 – Part 5
- Trying to understand the Managed System Gateway driver in IDM 4 – Part 6
The Data Collection Service driver is the second half of that pairing, and will be the topic of this series. Both of these drivers are meant to give the Reporting module enough information about the system to report upon it. The MSG driver is focused more on providing information about how the drivers are configured (heck, it even tries to infer the matching rule criteria by reading the rules out of the objects), while the DCS driver is focused more on collecting events about objects for storage in the Reporting database.
In the first article Data Collection Service Driver Walkthrough – Part 1 I started looking at how it builds a cache variable and got through most of the work it does to get the correct IP Address of the server running the Managed System Gateway driver.
In the second article Data Collection Service Driver Walkthrough – Part 2 I finished working through how the cache is built.
In this article I look at how queries out of the cache are handled, and the filter for this driver.
In the Policy object NOVLIDMDCSB-itp-DispatchRegistrationQuery you can see how this driver handles incoming queries for data that is already cached. First things first it makes sure that it is the specific query of interest, by looking at the class-name XML attribute. These queries will be for a class of __DCS_REGISTRATION__ which is clearly not a real eDirectory object class.
This is very similar to how it was handled in the Managed System Gateway (MSG) driver that I discussed in the articles mentioned above. What is funny is that in the version of the Package I am using, which is the 1.0.0 version, it is clear that they just copied and pasted the XML straight from one of the queries in the MSG driver, since the rule name is Resolve driver DN from GUID, which is a use case in the other driver.
However, the upside is that the way they wrote the policy it is quite independent, so it works fine as it is right now, without any issues.
On the one hand, I am quite happy to see that many of these rules trace messages about what they are doing, which is great. On the other hand, there are still very few comments in the DirXML Policy on WHY they are doing it. Thus we have the need for articles like these. I am hopeful, though, since with Packages it should be easier to do a documentation-only update of a Package, and the scope of each Package is much smaller, which makes the task easier to undertake. That is, smaller chunks are easier to accomplish.
The key to reading the data out of the cache is to remember it is a node set with real XML in it, and thus XPATH can be used to select.
The XPATH used is:
This means, looking in the local variable REGN_CACHE, under the <cache> node, select the <instance> node, whose XML attribute (indicated by the @ sign) class-name is the same as the variable query-id. This was set into a local variable by using the Class Name() token. They could have used the XPATH of @class-name to set that variable, or even the XML Attribute token, for class-name. However all three ways work equally well.
This selects the specific <instance> node, in case there is more than one.
When they trace out a message, the XML here is somewhat larger than a single attribute, so rather than using XML Serialize() on the selected node set to trace it out as text, they wrap that XPATH in a boolean() function call. If there is a value it returns true, and if there is no matching value it returns false, which leads to a nicer message in trace. Classy.
I always find XPATH easier to visualize if I have the XML I am trying to select from in front of me, so from the last article in this series, here is a sample of what the REGN_CACHE variable ought to look like, somewhat stripped down.
<cache>
  <instance class-name="__DCS_REGISTRATION__" src-dn="driverDn">
    <association>GUID Value</association>
    <attr attr-name="msgw-drv-address">
      <value type="string">IP Address</value>
    </attr>
    <attr attr-name="msgw-drv-port">
      <value></value>
    </attr>
    <attr attr-name="msgw-drv-context">
      <value type="string"></value>
    </attr>
  </instance>
</cache>
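To make the selection concrete, here is a minimal Python sketch that runs the same kind of lookup against a trimmed copy of the sample cache. It is only an illustration using the standard library's ElementTree, not the driver's own policy: the function names `select_instance` and `cache_hit` are made up for this example, and ElementTree has no XPATH variables, so the query-id is formatted into the predicate instead.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the REGN_CACHE node set shown above.
REGN_CACHE = ET.fromstring("""
<cache>
  <instance class-name="__DCS_REGISTRATION__" src-dn="driverDn">
    <association>GUID Value</association>
    <attr attr-name="msgw-drv-address">
      <value type="string">IP Address</value>
    </attr>
  </instance>
</cache>
""")


def select_instance(cache, query_id):
    """Return the <instance> whose class-name equals query_id, or None.

    Hypothetical helper: it mirrors the "instance under cache with a
    matching class-name attribute" selection described in the article.
    """
    return cache.find(f"instance[@class-name='{query_id}']")


def cache_hit(cache, query_id):
    """Mimic wrapping the selection in boolean() for a compact trace line."""
    return select_instance(cache, query_id) is not None


hit = select_instance(REGN_CACHE, "__DCS_REGISTRATION__")
print("cache hit:", cache_hit(REGN_CACHE, "__DCS_REGISTRATION__"))
print("cache hit for User:", cache_hit(REGN_CACHE, "User"))
print("src-dn:", hit.get("src-dn"))
```

Tracing only the true/false result keeps the trace line short, while the full node is still available to whatever consumes the selection.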
So if a query for a class-name of __DCS_REGISTRATION__ comes in, then this rule would fire and select the entire <instance> node shown.
Now the selection is cloned by XPATH into the <operation-data> node of the <query> event. So the flow of events might look something like this. A query is received from the shim on the Publisher channel.
<nds dtdversion="3.5" ndsversion="8.x">
  <source>
    <product build="4.0.0" instance="Data Collection Service Driver" version="4.0.0">Identity Manager Data Collection Service Driver</product>
    <contact>Novell, Inc.</contact>
  </source>
  <input>
    <query class-name="__DCS_REGISTRATION__" scope="subtree">
      <search-class class-name="__DCS_REGISTRATION__"/>
      <read-attr/>
    </query>
  </input>
</nds>
The Input transform rule sees that it is a __DCS_REGISTRATION__ class query. Then the clone by XPATH copies the appropriate data from the cache into the <operation-data> node, which the engine will return onto the Subscriber channel for the Output transform to see.
<nds dtdversion="3.5" ndsversion="8.x">
  <source>
    <product build="4.0.0" instance="Data Collection Service Driver" version="4.0.0">Identity Manager Data Collection Service Driver</product>
    <contact>Novell, Inc.</contact>
  </source>
  <input>
    <query class-name="__DCS_REGISTRATION__" scope="subtree">
      <search-class class-name="__DCS_REGISTRATION__"/>
      <read-attr/>
      <operation-data api-name="__DCS_REGISTRATION__">
        <instance class-name="__DCS_REGISTRATION__" src-dn="driverDn">
          <association>GUID Value</association>
          <attr attr-name="msgw-drv-address">
            <value type="string">IP Address</value>
          </attr>
          <attr attr-name="msgw-drv-port">
            <value></value>
          </attr>
          <attr attr-name="msgw-drv-context">
            <value type="string"></value>
          </attr>
        </instance>
      </operation-data>
    </query>
  </input>
</nds>
Then the Output transform, which will be discussed later, turns it back into a normal looking result document that the shim can process and do whatever it needs to with. In this case, the shim uses it to know how to connect to the target system, since the cached data that was read out of the MSG driver configuration provides an IP address and port.
<nds dtdversion="3.5" ndsversion="8.x">
  <source>
    <product version="4.0.0">DirXML</product>
    <contact>Novell, Inc.</contact>
  </source>
  <output>
    <instance class-name="__DCS_REGISTRATION__" src-dn="driverDn">
      <association>GUID Value</association>
      <attr attr-name="msgw-drv-address">
        <value type="string">IP Address</value>
      </attr>
      <attr attr-name="msgw-drv-port">
        <value></value>
      </attr>
      <attr attr-name="msgw-drv-context">
        <value type="string"></value>
      </attr>
    </instance>
  </output>
</nds>
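The three documents above can be read as a small pipeline: recognize the query, copy the cached instance into <operation-data>, then lift that instance into an <output> document. The Python sketch below walks through those steps with ElementTree. It is an illustration of the flow only, not the DirXML Script in the package; `input_transform` and `output_transform` are hypothetical names, and the real Output transform has not been covered yet, so its behavior here is assumed from the result document shown.

```python
import copy
import xml.etree.ElementTree as ET

REGISTRATION_CLASS = "__DCS_REGISTRATION__"

# Trimmed copies of the cache and of the query received from the shim.
CACHE = ET.fromstring(
    '<cache><instance class-name="__DCS_REGISTRATION__" src-dn="driverDn">'
    '<association>GUID Value</association>'
    '<attr attr-name="msgw-drv-address">'
    '<value type="string">IP Address</value></attr>'
    '</instance></cache>'
)
QUERY_DOC = ET.fromstring(
    '<nds dtdversion="3.5" ndsversion="8.x"><input>'
    '<query class-name="__DCS_REGISTRATION__" scope="subtree">'
    '<search-class class-name="__DCS_REGISTRATION__"/><read-attr/>'
    '</query></input></nds>'
)


def input_transform(doc, cache):
    """Answer registration queries from the cache via <operation-data>."""
    for query in list(doc.iter("query")):
        class_name = query.get("class-name")
        if class_name != REGISTRATION_CLASS:
            continue  # not the special query; leave it untouched
        hit = cache.find(f"instance[@class-name='{class_name}']")
        if hit is None:
            continue  # nothing cached for this class
        op_data = ET.SubElement(query, "operation-data",
                                {"api-name": class_name})
        op_data.append(copy.deepcopy(hit))  # clone, so the cache stays intact
    return doc


def output_transform(doc):
    """Assumed behavior: build a result document from the cloned instances."""
    result = ET.Element("nds", {"dtdversion": "3.5", "ndsversion": "8.x"})
    output = ET.SubElement(result, "output")
    for inst in doc.findall("./input/query/operation-data/instance"):
        output.append(copy.deepcopy(inst))
    return result


answered = input_transform(QUERY_DOC, CACHE)
result = output_transform(answered)
print(ET.tostring(result, encoding="unicode"))
```

The deep copy matters in the sketch for the same reason a clone matters in policy: the cached node has to survive to answer the next query.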
Next up I want to look at the filter for this driver. There are a number of obvious classes in the filter and some non-obvious ones as well. Of course User and Group are there with the usual set of attributes. All are set to synchronize on the Subscriber channel only (Ignore on the Publisher channel), with the exception of DirXML-Associations, which might be so that the driver can write an association value back.
We also have dynamicGroup, which is really a special case of Group, so much the same applies. There are some structural classes like Organization, Organizational Unit, and domain so that the tree structure can be monitored as well.
Now however we get into the interesting object classes. We have the following additional classes in the filter:
These break down into two different types of objects. The ‘nrf’ named object classes are used for the underlying assignment of Roles and Entitlements in the Roles Based Provisioning Module (RBPM). The two ‘srvprv’ named classes are used as well. In fact, the way a Role is granted in RBPM is that the User Application creates an srvprvRequest object in a container the Roles and Services driver is monitoring. (It is the Requests container in the AppConfig container, which is stored inside the context of the User Application driver. That is one of the reasons we need a User Application driver: it is a placeholder for the AppConfig container in eDirectory, where the Provisioning Request Definitions (PRD) and Directory Abstraction Layer (DAL) configurations are stored.)
The Roles and Services driver reads the request object and decides how to enforce the request. This results in setting attributes on the User, Group, or Container to indicate they are now members of a Role. They usually use path syntax for most of the attributes since this has the ability to have a DN reference attribute, usually pointing at the nrfRole object that is being assigned, with some XML in the string component to describe what the driver did, and a state value in the integer component. I have a bunch of trace I need to write up as an explanation with samples of how this all works under the covers.
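As a rough illustration of the three-part value described above, here is a Python sketch that pulls a path syntax value apart. Everything specific in it is an assumption made for the example: the component names nameSpace, volume, and path are how path syntax values commonly appear in XDS documents, and the DN, the XML in the string component, and the state number are invented samples rather than values taken from a real RBPM system.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class PathValue:
    state: int      # integer component (assumed to be the "nameSpace" part)
    target_dn: str  # DN component (assumed to be the "volume" part)
    details: str    # string component (assumed to be the "path" part)


# Hypothetical sample value; the contents are made up for illustration.
SAMPLE_VALUE = """
<value type="structured">
  <component name="nameSpace">1</component>
  <component name="volume">cn=ExampleRole,ou=roles,o=system</component>
  <component name="path">&lt;assignment&gt;&lt;note&gt;example&lt;/note&gt;&lt;/assignment&gt;</component>
</value>
"""


def parse_path_value(value_xml):
    """Split a structured path value into its three components."""
    value = ET.fromstring(value_xml)
    parts = {c.get("name"): (c.text or "") for c in value.findall("component")}
    return PathValue(
        state=int(parts["nameSpace"]),
        target_dn=parts["volume"],
        details=parts["path"],
    )


pv = parse_path_value(SAMPLE_VALUE)
print(pv.state, pv.target_dn)
# The string component is itself XML, so it can be parsed a second time.
print(ET.fromstring(pv.details).tag)
```

The second parse is the interesting part: because the string component carries escaped XML, a consumer has to decode it separately from the surrounding document.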
If you happen to be having issues with the rights in User Application when you first start using it, there are a pair of issues that could be getting in your way. The first is that the Java heap for the JBoss instance might not be sufficient at startup to initiate all the Roles requests as specified by configupdate.sh. Amusingly, this happens not when you first start JBoss, but rather the first time a web browser opens the User Application page, as that causes the WAR file to unpack itself. In the 3.7 version of the Roles Based Provisioning Module, reinitiating this rights generation is quite painful. (There is an attribute, full of XML, and one node needs to be removed and everything restarted.) In the 4.0 version of the RBPM there is now a button in the user interface to make it redo the rights initialization process, which is a great step forward. The second possible issue is a bizarre bug in Designer where sometimes DN syntax GCVs or configuration values will jump to some crazy value. This is hard to reproduce, since it always works when you want it to fail, and fails when you want it to work.
The second type of object is more about configuration, that is, the DirXML-Driver, which might indicate a configuration setting on the driver changed, and thus this driver should dirty and rebuild its registration cache.
I suppose a subset of that configuration case would be the objects that define the entitlements themselves. We start with the DirXML-Entitlement. This is the really low level component of a Role. It used to be that you assigned entitlements directly. But with RBPM we move to a slightly abstracted model, where we map Resource objects (nrfResource) to DirXML-Entitlements, since as a friend said “Entitlements are for computers, Resources are for People” (Thanks Mike W!). Resources give a friendlier name to Entitlements. Then a bunch of Resources get bundled up in an nrfRole. Of course Roles can have a hierarchy as well, though only three levels are allowed at this time.
To keep this all sane, RBPM tracks Separation of Duties (SOD) at the lowest level, so that when high level Roles are being built, you get warned that this Role would have an SOD violation if you went ahead with it. If a top level Role is being made that combines three other Roles, each of which has 10 different entitlements, it is pretty hard to know without detailed analysis whether there are any SOD issues with this Role. However, since the low level resources know which others they are not allowed to be combined with, and that is tracked, the engine can look down and check for you when it validates a Role you are assembling. I think the nrfSOD object is involved in that, but I am not sure how. It might be used to track the exception when you choose to allow a Role to have an SOD violation, or else to manage defined exceptions. Looks like fodder for a whole other series of articles!
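The "look down and check" idea can be sketched in a few lines of Python. To be clear, this is not how RBPM implements SOD, which I have not investigated; it is only a toy model of the description above, and the `Role` class, the role and resource names, and the conflict pair are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    name: str
    resources: set = field(default_factory=set)   # directly mapped resources
    children: list = field(default_factory=list)  # lower level Roles


def all_resources(role):
    """Collect every resource reachable by walking down the Role hierarchy."""
    found = set(role.resources)
    for child in role.children:
        found |= all_resources(child)
    return found


def sod_violations(role, conflicts):
    """Return the conflicting pairs that would both be held through role."""
    held = all_resources(role)
    return sorted(tuple(sorted(pair)) for pair in conflicts if pair <= held)


# Hypothetical data: three lower level Roles combined into one top level Role.
helpdesk = Role("Helpdesk", {"ResetPassword"})
payroll = Role("Payroll", {"ApprovePayment"})
payables = Role("AccountsPayable", {"CreateVendor"})
manager = Role("FinanceManager", children=[helpdesk, payroll, payables])

# Pairs of low level resources that must not end up with the same holder.
conflicts = {frozenset({"ApprovePayment", "CreateVendor"})}

print(sod_violations(manager, conflicts))
print(sod_violations(helpdesk, conflicts))
```

Neither Payroll nor AccountsPayable conflicts on its own; the violation only appears once both sit under the same parent, which is exactly the case that is hard to spot by eye.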
DirXML-Resource is a generic object class that uses a DirXML-ContentType attribute to indicate which type of resource object this instance of the object is, which is not the same kind of Resource that nrfResource manages. nrfResource is an object for defining a RBPM Resource object. The Mapping Table, Single Sign On (SSO) Credential Repository, SSO Application, ECMA Script, EntitlementConfiguration (which is needed to map Resources to Entitlements), raw XML, raw text, DS Object, or a new Package Prompt object are all different types of objects that use the same object class. Some of these are new to IDM 4, like the package prompt, and the Ds-object type.
I am not sure of the purpose of the object classes nrfConfiguration and nrfResourceAssociation, but you can guess from their names and allowed attributes what they are related to. Looking at the eDirectory schema and the attributes that they contain, I would say that nrfConfiguration (which differs from nrfConfig in some fashion) is directly involved in how Roles are handled. It sort of looks like there ought to be at most one of these per RBPM system. The nrfResourceAssociation (as opposed to nrfResourceAssociations; note the plural letter s at the end) looks to be the glue that links Roles to Resources, based on the various attributes that make up the object class. There are nrfRole, nrfResource, and nrfStatus attributes, which make it look like a glue object. It is not that common in eDirectory to see this kind of approach, connecting objects together through a third object. Very interesting, and it definitely needs some further investigating.
The srvprv set of object classes are usually from the older User Application schema, from before the entire Role model was fleshed out fully, as it is in the IDM 4 RBPM (and RBPM 3.7, since it is mostly similar). The srvrRbpmTeam class is for handling provisioning teams, which really ought to be manageable with a group, but it still has its own object class definition.
As you can see this driver is collecting a fairly diverse set of information for the Reporting module. Not just changes to User, Groups, and structural objects, but also changes to some of the configurations. Now in the world of RBPM the assignment of Roles is really more like a User, Group, or container setting, but it is important to be able to report about them.
Thus from the filter you can see that this driver is going to forward a lot of different events to the shim, which is feeding that data into the Reporting module's database. Changes to Users and Groups, changes to Role definitions, changes to Role assignments, changes to driver configuration, changes to provisioning teams, and more will get forwarded on.
In the next article in this series, I will look at an error handling rule in the Input transform and then start working through the Subscriber channel. There is next to nothing else in the Publisher channel, since this driver is more about reading events from eDirectory than writing anything back to it.