JDBC driver, using DOS syntax on Linux for the state file.
The Novell Identity Manager driver for JDBC databases is a very powerful driver. Part of its power is the wide range of possible solutions it includes.
It is actually an excellent example of how to use configuration values to let one driver configuration file handle many possible solutions. The driver can connect to many different databases over the JDBC protocol. Not every database is tested and supported, but all the big name databases are (Oracle, Microsoft SQL Server, IBM DB2, PostgreSQL, MySQL, and others), and those that are not tested often work pretty well regardless.
In addition to the various databases the driver can connect to, there are many implementation-specific options to consider, such as how you determine the next primary key value when inserting a row into a table (that is, an Add event on the Subscriber channel).
Additionally the driver can run locally, in the Identity Manager engine, which runs inside the eDirectory instance, or it can run under a Remote Loader on the database server (or even on a third server).
On top of all that, the driver can run in two fundamentally different models: triggered and triggerless. Triggered used to be the only mode. It uses a staging table, with triggers that watch any tables of interest and copy the data to the staging table. Anything that gets written into the staging table is then sent as an event on the Publisher channel. It is a fairly traditional approach to a database problem. The problem is that it requires changes in the database (the addition of triggers), and some database vendors may have issues if you add triggers to their product. We recently ran into this issue, where adding such triggers made the product unsupportable according to the vendor. No matter how much the client wanted to use it as the authoritative source for their identity solution, violating its support contract was not a direction they were willing to go in.
Triggerless mode is a lot less intrusive upon the database. The driver queries the timestamps of all the columns, stores them in a state file, and compares the current values to the stored ones to tell whether anything has changed. Once a change is detected, it is sent down the Publisher channel via the shim as an event.
A typical timestamp query looks something like the following, for a database with two tables being monitored, IDM.CLIENTS and IDM.WORK_ORDER.
DirXML: [10/15/07 12:09:23.02]: TRACE: Polling for database events...
DirXML: [10/15/07 12:09:23.02]: TRACE: Updating state for class 'IDM.CLIENTS'.
DirXML: [10/15/07 12:09:23.04]: TRACE: SELECT A.PK_SEQUENCE, A.CLIENT_ID, A.FIRST_NAME, A.MIDDLE_INITIAL, A.LAST_NAME, A.PHONE, A.EXT, A.SEQ_DEPARTMENT, A.MAIL_DROP, A.BUILDING, A.ROOM, A.ADDRESS, A.CITY, A.ZIP, A.SEQ_COMPANY, A.EMAIL_ADDRESS, A.BEEPER, A.PICTURE, A.STATE, A.CLIENTTEXT01, A.COUNTRY, A.WINDOWS_USER_ID, A.TITLE, A.ADDRESS2, A.NOVELL_ID, A.NEXTEL_RADIO_NUMBER, A.CONTRACT_END_DATE, A.DIRECTORS, A.EMPLOYEESTATUS1, A.RANK1, A.EMPLOYEETYPE1, A.NOTE, A.NOVELL_CONTEXT2, A.CSGUSER, A.FULL_NAME, A.CHARGE_BACK_ID, A.CHARGE_BACK_NAME, A.COMPANY_ID, A.COMPANY_NAME, A.DEPARTMENT_NAME, A.DIRECTORFIRSTNAME, A.DIRECTORLASTNAME FROM IDM.CLIENTS A ORDER BY A.PK_SEQUENCE ASC
DirXML: [10/15/07 12:09:57.57]: TRACE: Updating state for class 'IDM.WORK_ORDER'.
DirXML: [10/15/07 12:09:57.94]: TRACE: SELECT A.PK_WORK_ORDER_NUM, A.STATE, A.SEQ_COMPANY, A.SEQ_DEPARTMENT, A.SEQ_CLIENT, A.ACMEOPID, A.ACMEPROFILE1, A.ACMEPROFILE2, A.ACMEPROFILE3, A.ACMEPROFILE4, A.ACMEPROFILE5, A.WODIRECTORS, A.ACMEPROFILE6, A.ACMEPROFILE7, A.ACMEPROFILE8, A.ACMEPROFILE9, A.ACMEPROFILE10, A.ACMEPROFILE11, A.ACMEPROFILE12, A.ACMEPROFILE13, A.ACMEPROFILE14, A.ACMEPROFILE15, A.INCIDENT_DESCRIPTION, A.THINCLIENTPROFILE, A.ACMEDIVISIONS, A.ACMESUPERVISORS, A.ACMELNAME, A.ACMEFNAME, A.ACMEPHONE, A.ACMEEXPIREDATE, A.LAST_NAME, A.FIRST_NAME, A.CLIENT_ID, A.CL_EXT, A.COMPANY_NAME, A.COMPANY_ID, A.NOVELLCTEXTCDE, A.SUBJECT_ID, A.SUBJECT_DESCRIPTION FROM IDM.WORK_ORDER A ORDER BY A.PK_WORK_ORDER_NUM ASC
These queries are getting the data that is needed to keep the state file up to date, and allow it to detect changes.
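To make the idea concrete, here is a minimal sketch in Python of comparing a fresh poll against saved state. It is not the driver's actual implementation (the real shim is Java and keeps its state in a JDBM file). The names `row_fingerprint` and `diff_snapshot`, the in-memory dictionary standing in for the state file, and the use of a hash per row are all assumptions made for illustration.

```python
import hashlib


def row_fingerprint(row):
    """Hash the monitored column values of one row (hypothetical helper)."""
    joined = "\x1f".join("" if v is None else str(v) for v in row)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def diff_snapshot(previous_state, current_rows):
    """Compare saved state with a fresh poll.

    previous_state: dict mapping primary key -> fingerprint from the last poll.
    current_rows:   dict mapping primary key -> tuple of monitored column values.
    Returns (events, new_state); events is a list of (kind, primary_key).
    """
    events = []
    new_state = {}
    for pk, row in sorted(current_rows.items()):
        fp = row_fingerprint(row)
        new_state[pk] = fp
        if pk not in previous_state:
            events.append(("add", pk))
        elif previous_state[pk] != fp:
            events.append(("modify", pk))
    for pk in sorted(previous_state):
        if pk not in current_rows:
            events.append(("delete", pk))
    return events, new_state


# With an empty state every row looks new, which is why losing the
# state file leads to a resync of everything.
first_poll = {1: ("jsmith", "John", "Smith"), 2: ("bjones", "Bob", "Jones")}
events, state = diff_snapshot({}, first_poll)
print(events)

second_poll = {1: ("jsmith", "John", "Smythe"), 3: ("acole", "Ann", "Cole")}
events, state = diff_snapshot(state, second_poll)
print(events)
```

The first call reports every row as an add, and the second reports only what changed since the saved state was written.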
One of the big issues with triggerless implementations is that if anything happens to invalidate the state file, the driver will need to resynchronize all objects in its view. Many things can invalidate the state file. Sometimes it gets too large and the shim decides to recreate it from scratch. (There was a patch that helps this issue a lot, so it should not be a big deal anymore.) If you change the filter, which controls the list of columns that need to be watched for events, then the state file needs to be recreated. This is really frustrating during development, as each time you realize you need to monitor a new column, the state file is invalidated and a resync follows.
The easiest way to cause a resync on purpose is to delete the current state file, or to tell the driver configuration to use a different directory for it. The state file's location is a setting in the driver configuration, and the value needs to be valid in the context of where the driver is running. For example, on a Linux or Unix host running the shim, you would want a Unix-like syntax, such as /data/idm/state. On a Windows host you would want a DOS-like syntax, say d:\novell\jdbc\state, and on NetWare you would want a NetWare-style syntax, say data:\idm\jdbc\state.
What can complicate this decision is the location of the driver shim. If the Identity Manager engine is running on NetWare and the database is running on Windows, but the driver is running under a Remote Loader on AIX (IBM's Unix-like operating system), then the path to the state file would be in Unix-style syntax, since the Java-based shim that needs access to the state file is running on a Unix-like operating system.
If the Identity Manager engine is running on Windows, and the driver is running locally, then you would specify DOS-style syntax for the path to the state file.
What gets amusing is what happens if you started with the driver on a Windows-based Remote Loader, with the path set to c:\novell, and then for other reasons moved the driver to run locally on the Identity Manager engine, which in this example is running on SUSE Linux, but forgot to change the path to the file.
Then, to make it worse, after changing the path you do not see a resync event, so you look on the server that used to run the Remote Loader, see that the files have not been touched in months, and set the path to c:\novell\state and restart.
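As a rough illustration of why a filter change forces a rebuild, here is a small Python sketch. It assumes, purely for the sake of the example, that the saved state remembers which columns it was built from; the function name `needs_resync` is hypothetical and nothing here is taken from the shim's code.

```python
def needs_resync(saved_columns, filter_columns):
    """Return True when the saved state no longer matches the filter.

    saved_columns:  column names the state was built from, or None when
                    there is no usable state at all (deleted or corrupt).
    filter_columns: column names the filter currently asks for.
    """
    if saved_columns is None:
        return True
    # Order does not matter, only the set of monitored columns.
    return set(saved_columns) != set(filter_columns)


saved = ["FIRST_NAME", "LAST_NAME", "PHONE"]
print(needs_resync(saved, ["LAST_NAME", "FIRST_NAME", "PHONE"]))  # False
print(needs_resync(saved, saved + ["EMAIL_ADDRESS"]))             # True
print(needs_resync(None, saved))                                  # True
```

Adding a single column to the filter is enough to make the old state useless for comparison, which matches the development pain described above.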
Clearly c:\novell\state makes no sense whatsoever to a Linux server as a path to a file. Since the value is not an absolute path (that is, it has no leading forward slash (/) to anchor it at the root of the file system), the driver starts from the DIB directory (where the eDirectory database is stored) and resolves the configured value relative to it, no matter how crazy it might look.
There is no standard location for the eDirectory DIB directory on Linux, since you can put it anywhere that makes sense when you install it. The easiest way I know of to find it for certain takes a couple of easy steps.
Use ndsmanage to show the eDirectory instances running on the server. (Remember that as of eDirectory 8.8 you can run multiple instances on one device, so the tools are pretty good at handling the multiple instance model.) It will show you the set of eDirectory instances on the server. I have run as many as 8 instances simultaneously on a SUSE Linux server (ironically itself a virtual machine inside VMware); the trick in that case is to get a different IP address bound to the multi-homed network interface for each eDirectory instance. If there is only one instance, it is pretty simple. In either case (single instance or multiple instances), one of the descriptive things about the instance will be the .conf file in use, along with its path.
The reason to look this way, via ndsmanage, is that the conf file really has no standard location: it could be in /etc/opt/novell/eDirectory/ndsd.conf, or it could be in /home/bobsmith/stuff/Novell/eDir.conf, or anywhere else in the filesystem. Once ndsmanage has told you where the conf file is, look at its contents for the setting that gives the DIB directory:
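The difference in how the two platforms read the same string can be shown with Python's pathlib, which here only stands in for the Java file handling inside the shim (an assumption about equivalent behavior, not a look at the shim's code). The DIB directory used is the one from this article's example server.

```python
from pathlib import PurePosixPath, PureWindowsPath

configured = r"c:\novell\state"
# DIB directory of the example instance in this article.
dib_dir = PurePosixPath("/root/instance/data/dib")

as_windows = PureWindowsPath(configured)
as_posix = PurePosixPath(configured)

# On Windows the value is an absolute path made of a drive and two directories.
print(as_windows.is_absolute(), as_windows.parts)

# On Linux the colon and backslashes are ordinary filename characters,
# so the whole value is one relative name.
print(as_posix.is_absolute(), as_posix.parts)

# A relative name is resolved against a base directory, here the DIB directory.
resolved = dib_dir / as_posix
print(resolved)
```

Read with Windows rules the value splits into a drive and two directories, while with Unix rules it is a single oddly named entry that ends up underneath the base directory.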
locsles10gw:~ # /opt/novell/eDirectory/bin/ndsmanage
Instances management utility for Novell eDirectory 8.8 SP 1 v2
The following are the instances configured by root
 /root/instance/nds.conf : .ACMESMSLES10FS1.SERVERS.LAB.ACME-LAB. : 10.1.1.135@524 : ACTIVE
 /root/instance2/nds.conf : .NYCSLES10LAB1.SERVERS.VPN-LAB.ACME-VPN-LAB. : 10.1.1.137@524 : ACTIVE
Enter [r] to refresh list, [1 - 2] for more options, [c] for creating a new instance or [q] to quit:
locsles10gw:~ # less /root/instance/nds.conf
https.server.cached-cert-dn=SSL CertificateIP - ACMESMSLES10FS1.SERVERS.LAB
And there it is: n4u.nds.dibdir=/root/instance//data/dib
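If you would rather script that lookup than page through the file with less, a minimal Python sketch follows. It assumes the conf file is plain key=value lines, as the excerpt above suggests. The function name `read_conf_value` and the handling of `#` comment lines are assumptions for illustration.

```python
import posixpath


def read_conf_value(conf_text, key):
    """Return the value for key from key=value text, or None if absent."""
    for line in conf_text.splitlines():
        line = line.strip()
        # Skipping blank lines and '#' comments is an assumption about the format.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


# The dibdir line is the one shown above; the comment line is made up.
sample = """\
# hypothetical comment line
n4u.nds.dibdir=/root/instance//data/dib
"""

dib_dir = read_conf_value(sample, "n4u.nds.dibdir")
print(dib_dir)
# normpath collapses the doubled slash into the path you would cd to.
print(posixpath.normpath(dib_dir))
```

On a real server you would pass in the text of the conf file that ndsmanage reported, for example the contents of /root/instance/nds.conf.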
Take a look at that path in /root/instance/data/dib and there you will see the following directories:
drwxrwxrwx 2 root root 752 Jun 5 18:12 c:\novell
drwxr-x--x 2 root root 576 Jun 6 16:40 c:\novell\state
Kind of neat, c:\novell on a Linux server. Thankfully it is pretty easy to clean up the directories: just enclose the name in double quotes (") when you use rm or rmdir on them and it will work.
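If you suspect there are more of these lying around, a small Python sketch like the one below can pick out directory entries whose names look like DOS paths. The function name `stray_dos_names` and the sample listing are illustrative assumptions; the two c:\ names are the ones from this server.

```python
import re

# A drive letter, a colon, then a backslash, e.g. c:\novell
DOS_PATH = re.compile(r"^[A-Za-z]:\\")


def stray_dos_names(names):
    """Return the entries whose names look like DOS paths."""
    return [n for n in names if DOS_PATH.match(n)]


# Illustrative listing; on a real server this would come from os.listdir(dib_dir).
listing = ["nds.db", "nds.01", r"c:\novell", r"c:\novell\state"]
print(stray_dos_names(listing))
```

When removing such entries from Python, each name is passed to os.rmdir or shutil.rmtree as a single string, so no shell quoting is involved.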
The directory will look something like:
locsles10gw:~/instance/data/dib # cd "c:\novell\state"
locsles10gw:~/instance/data/dib/c:\novell\state # dir
-rw-r----- 1 root root 2842624 Jun 6 16:40 jdbc_cea8d200-f83f-01db-8030-050002000000.db
-rw-r----- 1 root root 3648010 Jun 6 16:42 jdbc_cea8d200-f83f-01db-8030-050002000000.lg
-rw-r----- 1 root root 0 Jun 6 16:40 jdbc_cea8d200-f83f-01db-8030-050002000000_0.db
-rw-r----- 1 root root 66208 Jun 6 16:40 jdbc_cea8d200-f83f-01db-8030-050002000000_0.lg
-rw-r----- 1 root root 0 Jun 6 16:40 jdbc_cea8d200-f83f-01db-8030-050002000000_1.db
-rw-r----- 1 root root 66208 Jun 6 16:40 jdbc_cea8d200-f83f-01db-8030-050002000000_1.lg
Now that you have found it, here is one of the errors that you can get when there is an issue with the state file and the driver decides it is time to rebuild it. Usually this is pretty hard to catch in the logs, since a resync event, which is what immediately follows this error, generates a huge amount of trace, and unless you have your trace files set to a huge size it usually will scroll out.
I was pretty happy to catch this going by. This is one of those errors you almost never get to see, since the state file goes bad at pretty much indeterminate times, and I actually saw this one happen, in a tailed log file, and was able to snag the error message. The guys I work with thought it was pretty funny how excited I got over a silly error message, but you have to admit, this is a hard error to catch!
<nds dtdversion="2.0" ndsversion="8.x" xmlns:jdbc="urn:dirxml:jdbc">
  <source>
    <product build="20070918_0743" instance="MAGIC-JDBC" version="3.5.2">DirXML Driver for JDBC</product>
    <contact>Novell, Inc.</contact>
  </source>
  <input>
    <status level="warning" type="driver-general">
      <description>Triggerless publication state file corruption detected: java.lang.IllegalStateException: This iterator has expired. The backing tree has been reloaded from disk.. Corrupted file 'c:\novell/jdbc_cea8d200-f83f-01db-8030-050002000000' has been archived for debugging purposes. Intiating a resync of all objects.</description>
      <jdbc:exception jdbc:class="java.lang.IllegalStateException">
        <jdbc:message>This iterator has expired. The backing tree has been reloaded from disk.</jdbc:message>
        <jdbc:stack-trace>java.lang.IllegalStateException: This iterator has expired. The backing tree has been reloaded from disk.
          at com.novell.nds.dirxml.driver.jdbc.util.jdbm.JDBMTree$Iterator.checkState(Unknown Source)
          at com.novell.nds.dirxml.driver.jdbc.util.jdbm.JDBMTree$Iterator.hasNext(Unknown Source)
          at com.novell.nds.dirxml.driver.jdbc.util.state.StateMediator.advanceDirectoryPointer(Unknown Source)
          at com.novell.nds.dirxml.driver.jdbc.util.state.StateMediator.stringImplementation(Unknown Source)
          at com.novell.nds.dirxml.driver.jdbc.util.state.StateMediator.startImpl(Unknown Source)
          at com.novell.nds.dirxml.driver.jdbc.util.state.StateMediator.start(Unknown Source)
          at com.novell.nds.dirxml.driver.jdbc.JDBCTriggerlessPublicationShim.waitForEvents(Unknown Source)
          at com.novell.nds.dirxml.driver.jdbc.JDBCPublicationShim.pollImpl(Unknown Source)
          at com.novell.nds.dirxml.driver.jdbc.JDBCPublicationShim.pollLoop(Unknown Source)
          at com.novell.nds.dirxml.driver.jdbc.JDBCPublicationShim.start(Unknown Source)
          at com.novell.nds.dirxml.driver.jdbc.JDBCPublicationProxy.start(Unknown Source)
          at com.novell.nds.dirxml.engine.Publisher.run(Publisher.java:388)
          at java.lang.Thread.run(Unknown Source)
        </jdbc:stack-trace>
      </jdbc:exception>
    </status>
  </input>
</nds>
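Since this message is so easy to miss in a busy trace, it can help to look for it programmatically. The following Python sketch pulls the status level, exception class, and description out of a status document shaped like the one above. The sample is an abbreviated copy of that document, the function name `state_file_problems` is hypothetical, and matching on the phrase "state file" in the description is an assumption based on the messages shown in this article.

```python
import xml.etree.ElementTree as ET

JDBC_NS = {"jdbc": "urn:dirxml:jdbc"}

# Abbreviated copy of the status document above (stack trace left out).
SAMPLE = """<nds dtdversion="2.0" ndsversion="8.x" xmlns:jdbc="urn:dirxml:jdbc">
  <input>
    <status level="warning" type="driver-general">
      <description>Triggerless publication state file corruption detected: java.lang.IllegalStateException: This iterator has expired.</description>
      <jdbc:exception jdbc:class="java.lang.IllegalStateException">
        <jdbc:message>This iterator has expired. The backing tree has been reloaded from disk.</jdbc:message>
      </jdbc:exception>
    </status>
  </input>
</nds>"""


def state_file_problems(xml_text):
    """Return (level, exception class, description) for state file statuses."""
    root = ET.fromstring(xml_text)
    found = []
    for status in root.iter("status"):
        description = status.findtext("description", default="")
        if "state file" not in description:
            continue
        exc = status.find("jdbc:exception", JDBC_NS)
        exc_class = None
        if exc is not None:
            # Namespaced attributes are keyed by "{namespace-uri}name".
            exc_class = exc.get("{urn:dirxml:jdbc}class")
        found.append((status.get("level"), exc_class, description))
    return found


for level, exc_class, description in state_file_problems(SAMPLE):
    print(level, exc_class)
    print(description)
```

A stack trace copied out of a log can contain unescaped text such as <init>, which a strict XML parser rejects, so the sample leaves the stack trace out.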
There was a patch to jdbm.jar issued a while back that fixes some issues with the state file growing and growing and growing. On a test system I was working on, we had about 10,000 users with maybe 30 columns per row, and one other empty table being synced. With a fresh state file, it needed about 1.5 Megs of space for the timestamps. But we watched it grow beyond 500 Megs before the patch.
With the patch it seems to stay below 200 Megs. Still not good enough, but every little step helps.
Of course a consequence of the state file getting too big can be the following error:
<nds dtdversion="2.0" ndsversion="8.x" xmlns:jdbc="urn:dirxml:jdbc">
  <source>
    <product build="20070918_0743" instance="MAGIC-JDBC" version="3.5.2">DirXML Driver for JDBC</product>
    <contact>Novell, Inc.</contact>
  </source>
  <output>
    <status level="fatal" type="driver-general">
      <description>Unable to archive/open triggerless publication state file 'jdbc_cea8d200-f83f-01db-8030-050002000000': java.io.FileNotFoundException: c:\novell\state/jdbc_cea8d200f83f-01db-8030-050002000000.db (No space left on device)</description>
      <jdbc:exception jdbc:class="java.io.FileNotFoundException">
        <jdbc:message>c:\novell\state/jdbc_cea8d200-f83f-01db-8030-050002000000.db (No space left on device)</jdbc:message>
        <jdbc:stack-trace>java.io.FileNotFoundException: c:\novell\state/jdbc_cea8d200-f83f-01db-8030-050002000000.db (No space left on device)
          at java.io.RandomAccessFile.open(Native Method)
          at java.io.RandomAccessFile.<init>(Unknown Source)
          at java.io.RandomAccessFile.<init>(Unknown Source)
          at jdbm.recman.RecordFile.<init>(RecordFile.java:98)
          at jdbm.recman.RecordManager.<init>(RecordManager.java:98)
          at com.novell.nds.dirxml.driver.jdbc.util.jdbm.JDBMFile.setRecMan(Unknown Source)
          at com.novell.nds.dirxml.driver.jdbc.util.jdbm.JDBMFile.open(Unknown Source)
          at com.novell.nds.dirxml.driver.jdbc.util.state.StateMediator.start(Unknown Source)
          at com.novell.nds.dirxml.driver.jdbc.JDBCTriggerlessPublicationShim.waitForEvents(Unknown Source)
          at com.novell.nds.dirxml.driver.jdbc.JDBCPublicationShim.pollImpl(Unknown Source)
          at com.novell.nds.dirxml.driver.jdbc.JDBCPublicationShim.pollLoop(Unknown Source)
          at com.novell.nds.dirxml.driver.jdbc.JDBCPublicationShim.start(Unknown Source)
          at com.novell.nds.dirxml.driver.jdbc.JDBCPublicationProxy.start(Unknown Source)
          at com.novell.nds.dirxml.engine.Publisher.run(Publisher.java:388)
          at java.lang.Thread.run(Unknown Source)
        </jdbc:stack-trace>
      </jdbc:exception>
    </status>
  </output>
</nds>
We ran out of disk space, since we had all the drivers tracing, up to a full gigabyte of trace per driver. At one point we had a lot of disk space, but alas all that trace ate it up pretty quickly.
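A simple way to avoid being surprised by this is to keep an eye on both numbers: how big the state files are and how much room is left on that filesystem. The Python sketch below does that. The function name `state_file_usage` is hypothetical, and the `jdbc_` prefix is taken from the file names in the directory listing earlier.

```python
import os
import shutil


def state_file_usage(state_dir, prefix="jdbc_"):
    """Return (bytes used by state files, bytes free on that filesystem)."""
    used = 0
    for name in os.listdir(state_dir):
        path = os.path.join(state_dir, name)
        # Count only regular files that look like JDBC driver state files.
        if name.startswith(prefix) and os.path.isfile(path):
            used += os.path.getsize(path)
    free = shutil.disk_usage(state_dir).free
    return used, free


# Demo against the current directory; point it at the real state directory instead.
used, free = state_file_usage(os.getcwd())
print("state files: %d bytes, free space: %d bytes" % (used, free))
```

Run from cron or a monitoring script, something like this gives you a warning before the driver hits "No space left on device".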
Anyway, hopefully that will help with some troubleshooting around the state files in a triggerless implementation of the Identity Manager JDBC driver for databases.