I was sitting at Brainshare 2008, watching a Troubleshooting Identity Manager session. On screen, with a live example and trace files, the presenter was showing how to use DSTRACE to troubleshoot a problem.
As I was trying to think of how to do this in an article, I realized I would need to include huge amounts of trace file in the article, then dissect it line by line. I suspect that would bore most people, so instead, I will recomend an article I wrote that did that, as an implicit example of how to do this: http://www.novell.com/communities/node/1474/password-transformation-rule-sets
Instead, I would like to talk about general tips and tricks for reading a trace, to minimize the amount of trace included in the article.
Tips and Tricks
There are a couple of things to keep in mind that may help make sense out of what at first may look like a total mishmash of overwhelming logging:
1. Separate things as much as possible. If you look at the trace from iMonitor, DSTRACE.NLM (on Netware), ndstrace (on Unix like operating systems, Linux, AIX, HPUX, Solaris). or dstrace.dlm (on Windows), and you have more than one driver on that server, then you will see the output from both drivers intermingled.
2. Each line will usually be prefixed by a timestamp, then a driver name, finally a PT or ST for the Publisher Channel Trace, or Subscriber channel Trace. So in principle you could read through and only look for the driver of interest to track down. In practice this is a total pain, and not worth the effort. Instead, what you should do for each driver, is to go to the Misc tab of the Driver properties in iManager with the Identity Manager plugins, and set a trace level that is appropriate (usually 3), and most importantly, specify a path to store the logs for this driver.
The reason this is useful is that you will get a set of logs, as big as you specify the maximum size to be, for JUST this driver. If you do this for each driver in the driver set, it means you will be able to more easily read the trace to find your issue. The trace file will actually grow to one tenth the size you set, and there will be ten such files, making it easier to manage the files. Even on Linux, reading and searching 100 Meg text files can slow a machine down. (Windows suffers a lot more with most editors, but even Linux is not immune).
Now that you have a single drivers trace file to work with, the task is much easier, since there are no other drivers interleaved within the trace. Alas, there is still an issue. Events happen on both channels in most drivers, the Subscriber (from eDirectory to the connected application) and the Publisher (events from the connected application flowing into eDirectory). So you will have to watch out for that intermingling of events if something is happening on both channels at the same time.
Reading a Trace Line
Here is an example snippet of a single line from a trace.
03:06:28 9327BBA0 Drvrs: SolarisV PT: Action: do-if().
We see the time stamp as 3:06AM (this was on a VMWare instance, and the darn time kept drifting, a LOT! But since it was in a test lab, we did not really care). Next we have 9327BBA0, which looks like an object or event ID – I am not sure which – and we mostly ignore it.
Then we have Drvrs: which tells us this is coming from the DRVR filter flag that we enabled in DSTRACE. The name of the driver is next – SolarisV. If the driver has an inconviently long name, you can make this easier to read by specifying a name to appear in trace, while setting the trace level and trace directory as mentioned above. Each driver will show its name so that you can identify it in trace, but if you are doing as suggested and tracing each driver to a file, it should only show one name ever, per file.
The channel that the event is running on, PT: is important, as it tells us what the perspective we are taking on the event, and finally we have the actual event being traced, in this case a pretty boring action, testing an IF condition as: Action: do-if().
So far so good, this looks pretty easy. Now where things get confusing is that a rule in a policy can cause all sorts of different things to happen.
Dealing with Rules
If, for example, the condition test in your rule requires that a source or destination attribute be available, then you will see a <query> event go out, inside an <input> node. The thing that gets confusing is that the current processing of the rule pauses while the Query is being made for that attribute. The Input document with the query itself goes through the entire channels rule set, hits the end, and then returns an <output> document with the value of the results (or an empty result set). That Output document is processed through the entire rule set as well, so it can look like suddenly your rules started over again at the beginning. This is probably the most confusing part of reading a trace. You are following along, and suddenly this new query in an input document comes along, and the rules start at the beginning again.
Here’s the trick: when you see an <input> documet with a <Query> node or the like, start paging down until you find the <output> document. For interest, see what is returned – a value or nothing at all – and then keep going till something happens after the <output> document is finished.
You will see this in the Matching rules, since the matching rule almost by definition needs to query for a matching user in the system at the other end of the channel. If it returns nothing, you will usually see a trace line that says No Match found, or possible Match found if there was a match.
Any time you use the Source Attribute or Destination attribute the engine needs to query in the appropriate direction. As a side note, if you know the attribute will always be in the document, use Operational Attribute instead. If it never will be in the document generating the event, use Source Attribute. If you think it might sometimes be there, and sometimes not, use Attribute – which is smart enough to use the value in the document, and if it is not there, to query the source.
Of course, there are a number of other tokens that will always cause a query, such as the Query token in Arguement builder, or the Unique Name token – and many others, I am sure. Once you learn to skip through these quickly, you will be much more adept at tracing. The easiest way is just skip down a page at a time until you see the last <output> document, and then the rules suddenly continue where you left off.
The overall flow of a trace file is that an event happens, either in eDirectory or in the connected system, and is passed to the shim or the engine. For simplicity, I will only look at the Subscriber channel, since trying to qualify every comment with the appropriate wording for each channel will get really wordy. In the trace we will see our event, usually with a ST: flag, and a comment perhaps of “Document received”.
Then we will see the XDS document that describes the event, which looks something like this:
<source> <product version="220.127.116.1170411 ">DirXML</product> <contact>Novell, Inc.</contact> </source> <input> <add allow-adminp-support="true" certify-user="true" class-name="User" event-id="TESTACME1#20070628194836#1#2" src-dn="\TIDM\US\ACMEC\idmusers\UKane" src-entry-id="33552"> <add-attr attr-name="CN"> <value naming="true" timestamp="1183060114#27" type="string">UKane</value> </add-attr> <add-attr attr-name="Given Name"> <value timestamp="1183060114#5" type="string">Um</value> </add-attr> <add-attr attr-name="nspmDistributionPassword"><!-- content suppressed --> </add-attr> <add-attr attr-name="Surname"> <value timestamp="1183060114#7" type="string">Kane</value> </add-attr> <add-attr attr-name="uniqueID"> <value type="string">UKane</value> </add-attr> <add-attr attr-name="MailFile"> <value>UKane.nsf</value> </add-attr> <add-attr attr-name="Internet EMail Address"> <value>UKane@adtest.com</value> </add-attr> </add> </input> </nds>
Then the rules, as defined in your driver, start to process on the event, sequentially. The first rule in the Event transformation tests its conditions. If any of them pass, the rule performs the specified actions (often to Veto the operation).
After each Rule in your driver definition is processed, the resulting event document will be displayed in the trace. Thus you can see the before, the actions, and the after document. You can see what it looked like before the rules did their stuff and what it looks like after, and decide if that was what you intended. This makes it look like a lot more trace, as many rules that do not really do much will cause the event document to be displayed again, exactly the same as the previous version.
The next big thing to watch out for in trace is a type of event flow that can be quite confusing. If a Modify event occurs, and we get an XDS document for it, but the object is not associated, then something different may occur. An associated object usually looks like it has an <association> tag, with a value of the association value, like <association state=”associated”>SOME ASSOCIATION VALUE</association>. This type of event does process through the Event Transformation rule, but once it gets past it, it hits a decision point and becomes a synthetic add. There will actually be a line in trace that says “Synthetic Add,” so you can search for that and find where it is happening.
Once a modify event is converted to an add event, the driver will run it through the Matching rule. If no match is found, it will run through the Create and Placement rules as appropriate. This can be confusing to watch, as a Modify event suddenly starts acting like an Add event. More importantly, if you have rules in the Event transformation watching for a Modify to do something with it, they will probably not block this type of event, and it is best to place them in the Command Transformation rule instead.
This continues down the rules until finally, you will see a line Pumping XDS and that is point at which the document is handed off to the driver shim. To continue tracing this, you would need to look at the driver shim’s trace output, which will do its magic and return a status document.
In our trace example, we will see an <output> document returned with a <status> node with a level of success, retry, warning, or fatal. If there was operational data on the original data, the engine will add it back to the output document.
The more you practice and try reading trace output, the better you will get. I find it easiest to read trace on a UNIX operating system, using tools like “less” and “grep”. There are a lot of powerful abilities buried in those tools; I find them easier to use than a GUI tool like Wordpad on Windows.
I hope this has been helpful in discussing the reading of an Identity Manager trace file. I think the next thing to do is to try to find a simple trace example to walk through, line by line. As I said above, the example of http://www.novell.com/communities/node/1474/password-transformation-rule-sets link is a pretty good example of doing this.