7.4 Offline Bulkload Utility

The ldif2dib utility lets you bulkload data from LDIF files into the NetIQ eDirectory database (DIB) while the eDirectory server is offline. eDirectory supports this utility on both Linux and Windows platforms. Because it runs offline, ldif2dib achieves faster bulkloads than the online tools. The utility imports entries from an LDIF file into the existing DIB; it does not create a new database.

Use ldif2dib when you need to populate a large user database with entries from an LDIF file. Online tools such as ICE or ldapmodify are slower than ldif2dib because of the overhead associated with online bulkload, such as schema checking, protocol translation, and access control checks. ldif2dib lets you populate a large user database and bring it up quickly when initial down time is not an issue.

7.4.1 Improving Bulkload Performance

eDirectory provides options to increase bulkload performance. The following are the tunable parameters for bulkload performance using the NetIQ Import Convert Export (ICE) utility.

Also refer to the various operating system tunable parameters.

eDirectory Cache Settings

To optimize the bulkload performance, allocate a higher percentage of the eDirectory cache for block cache. For more details refer to “Tuning eDirectory Subsystems” in the NetIQ eDirectory Tuning Guide.

LBURP Transaction Size Setting

The LBURP transaction size sets the number of records that are sent from ICE to the LDAP server during a single transaction. Increasing this value can improve bulkload performance, assuming that you have adequate memory and that the increase does not cause I/O contention. The default transaction size is 25, which is appropriate for small LDIF files (fewer than 100,000 operations) but not for a large number of records. The LBURP transaction size can be set between 1 and 350.

Modifying the Transaction Size

To modify the transaction size, modify the required value for the n4u.ldap.lburp.transize parameter in /etc/opt/novell/eDirectory/conf/nds.conf. In ideal scenarios, a higher transaction size ensures faster performance. However, the transaction size must not be set to arbitrarily high values for the following reasons:

  • A larger transaction size requires the server to allocate more memory to process the transaction. If the system is running low on memory, this can cause a slowdown due to swapping.

  • The LDIF file should be free of errors and any entries already existing in eDirectory should be commented out. Even if a single error exists in the transaction (including cases where the object to be added already exists in the directory), eDirectory ignores the LBURP transaction setting and performs a commit after each operation to ensure data integrity.

    For more information, see “Debugging LDIF Files” in the NetIQ eDirectory Troubleshooting Guide.

  • LBURP optimization works only for leaf objects. If the transaction contains both a container and its subordinate objects, eDirectory treats this as an error. To avoid this, we recommend either loading the container objects first using a separate LDIF file or enabling the use of forward references.

    For more information, see “Enabling Forward References” in the NetIQ eDirectory Troubleshooting Guide.
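As a minimal sketch of the nds.conf edit described above, the following Python helper rewrites the n4u.ldap.lburp.transize line in the text of the file and rejects values outside the documented range of 1 to 350. It assumes that nds.conf stores one key=value pair per line. The function name set_lburp_transize is illustrative and is not part of eDirectory.

```python
def set_lburp_transize(conf_text, size):
    """Return conf_text with n4u.ldap.lburp.transize set to size."""
    # The documented range for the LBURP transaction size is 1 to 350.
    if not 1 <= size <= 350:
        raise ValueError("LBURP transaction size must be between 1 and 350")
    key = "n4u.ldap.lburp.transize"
    new_line = f"{key}={size}"
    out, found = [], False
    for line in conf_text.splitlines():
        # Assumption: each setting is written as key=value on its own line.
        if line.split("=", 1)[0].strip() == key:
            out.append(new_line)
            found = True
        else:
            out.append(line)
    if not found:
        # The parameter was not present, so append it.
        out.append(new_line)
    return "\n".join(out) + "\n"
```

Working on the file text rather than the file itself makes it easy to review the result before writing it back to /etc/opt/novell/eDirectory/conf/nds.conf.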

Increasing the Number of Asynchronous Requests in ICE

This refers to the number of entries the ICE client can send to the LDAP server asynchronously before waiting for any result back from the server. The number of asynchronous requests can be set between 10 and 200. The default value is 100. Any value less than the minimum value (10) falls back to the default. The minimum value is appropriate for small LDIF files. In ideal scenarios, a higher window size ensures faster performance. However, the window size must not be set to arbitrarily high values because a larger window size requires the client to allocate more memory to process the entries in the LDIF file. If the system is running low on memory, this can cause a slowdown due to swapping.

You can modify the number of asynchronous requests in ICE using either the ICE command line option or iManager.

Using ICE Command Line Option

The number of asynchronous requests can be specified using the ICE command line option -Z. This is available as part of the LDAP destination handler.

To set the number of asynchronous requests sent by the ICE client to 50, you would enter the following command:

ice -SLDIF -f LDIF_file -a -c -DLDAP -d cn_of_admin -Z50 -w password
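The window size rules above can be sketched in Python as follows. The helper applies the documented fallback (values below 10 use the default of 100) and builds the argument list for the command shown above, using only the options that appear in it. The function names are illustrative, and rejecting values above 200 is an assumption of this sketch, because the behavior of ICE in that case is not described here.

```python
def ice_window_size(requested, default=100, minimum=10, maximum=200):
    """Return the number of asynchronous requests ICE is expected to use."""
    # Values below the minimum fall back to the default, per the text above.
    if requested < minimum:
        return default
    # Assumption: refuse values above the documented maximum instead of
    # passing them to ICE, whose behavior for such values is not described.
    if requested > maximum:
        raise ValueError("window size must not exceed 200")
    return requested


def ice_import_args(ldif_file, admin_dn, password, window):
    """Build the argument list for the ICE import command shown above."""
    return ["ice", "-SLDIF", "-f", ldif_file, "-a", "-c",
            "-DLDAP", "-d", admin_dn, f"-Z{ice_window_size(window)}",
            "-w", password]
```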

Using iManager ICE Wizard

To set the number of asynchronous requests sent by the ICE client through iManager:

  1. Click the Roles and Tasks button.

  2. Click eDirectory Maintenance > Import Convert Export Wizard.

  3. Type the value in the LBURP Window Size field in the LDAP Destination Handler screens in both the Importing Data from a File and Migrating Data between LDAP Servers tasks.

  4. Click Next.

    For more information, refer to the help provided in the Wizard.

Increased Number of LDAP Writer Threads

The LDAP server now has multiple writer threads. Use the -F ICE command line option for enabling forward referencing to avoid any possible errors due to concurrent processing as follows:

ice -SLDIF -f LDIF_file -a -c -DLDAP -d cn_of_admin -w password -F

Disabling Schema Validation in ICE

Use the -C and -n ICE command line options to disable schema validation at the ICE client as follows:

ice -C -n -SLDIF -f LDIF_file -a -c -DLDAP -d cn_of_admin -w password

Backlinker

Backlinker is a background process that checks referential integrity, among other checks. It first runs 50 minutes after the eDirectory server comes up and runs again 13 hours later. Ensure that backlinker does not run during the bulkload process. If it does run, it can hinder the bulkload, depending on the timing and the number of objects loaded.
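To plan a bulkload around these times, the schedule can be sketched in Python as follows. The sketch assumes that the second run starts 13 hours after the first one, which is one reading of the description above, and the function names are illustrative.

```python
from datetime import datetime, timedelta


def backlinker_runs(server_start):
    """Return the first two expected backlinker run times."""
    first = server_start + timedelta(minutes=50)
    # Assumption: the next run is counted from the first run.
    second = first + timedelta(hours=13)
    return first, second


def overlaps_backlinker(server_start, load_start, load_end):
    """Return True if an expected backlinker run falls inside the bulkload."""
    return any(load_start <= run <= load_end
               for run in backlinker_runs(server_start))
```

For a server started at midnight, a bulkload from 01:00 to 05:00 falls between the two expected runs, whereas one that starts at 00:30 and ends at 01:00 does not.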

Disabling ACL Templates

You can disable the Access Control List (ACL) templates to increase the bulkload performance. The implication of this is that some of the ACLs will be missing. However, you can resolve this by adding the required ACLs to the LDIF file or applying them later.

  1. Run the following command:

    ldapsearch -D cn_of_admin -w password -b cn=schema -s base objectclasses=inetorgperson 

    The output of this command would be similar to the following:

    dn: cn=schema
    objectClasses: ( 2.16.840.1.113730.3.2.2 NAME 'inetOrgPerson' SUP
     organizationalPerson STRUCTURAL MAY ( groupMembership $ ndsHomeDirectory
     $ loginAllowedTimeMap $ loginDisabled $ loginExpirationTime $
     loginGraceLimit $ loginGraceRemaining $ loginIntruderAddress $
     loginIntruderAttempts $ loginIntruderResetTime $
     loginMaximumSimultaneous $ loginScript $ loginTime $
     networkAddressRestriction $ networkAddress $ passwordsUsed $
     passwordAllowChange $ passwordExpirationInterval $
     passwordExpirationTime $ passwordMinimumLength $ passwordRequired $
     passwordUniqueRequired $ printJobConfiguration $ privateKey $ Profile $
     publicKey $ securityEquals $ accountBalance $ allowUnlimitedCredit $
     minimumAccountBalance $ messageServer $ Language $ UID $
     lockedByIntruder $ serverHolds $ lastLoginTime $ typeCreatorMap $
     higherPrivileges $ printerControl $ securityFlags $ profileMembership $
     Timezone $ sASServiceDN $ sASSecretStore $ sASSecretStoreKey $
     sASSecretStoreData $ sASPKIStoreKeys $ userCertificate $
     nDSPKIUserCertificateInfo $ nDSPKIKeystore $ rADIUSActiveConnections $
     rADIUSAttributeLists $ rADIUSConcurrentLimit $ rADIUSConnectionHistory
     $ rADIUSDefaultProfile $ rADIUSDialAccessGroup $ rADIUSEnableDialAccess
     $ rADIUSPassword $ rADIUSServiceList $ audio $ businessCategory $
     carLicense $ departmentNumber $ employeeNumber $ employeeType $
     givenName $ homePhone $ homePostalAddress $ initials $ jpegPhoto $
     labeledUri $ mail $ manager $ mobile $ pager $ ldapPhoto $
     preferredLanguage $ roomNumber $ secretary $ uid $ userSMIMECertificate
     $ x500UniqueIdentifier $ displayName $ userPKCS12 ) X-NDS_NAME 'User'
     X-NDS_NOT_CONTAINER '1' X-NDS_NONREMOVABLE '1' X-NDS_ACL_TEMPLATES (
     '2#subtree#[Self]#[All Attributes Rights]' '6#entry#[Self]#loginScript'
     '1#subtree#[Root Template]#[Entry Rights]'
     '2#entry#[Public]#messageServer'
     '2#entry#[Root Template]#groupMembership'
     '6#entry#[Self]#printJobConfiguration'
     '2#entry#[Root Template]#networkAddress') )
  2. In the output noted in the previous step, delete the X-NDS_ACL_TEMPLATES clause, that is, the text from X-NDS_ACL_TEMPLATES through its closing parenthesis.

  3. Save the revised output as an LDIF file.

  4. Add the following information to the newly saved LDIF file:

    dn: cn=schema
    changetype: modify
    delete: objectclasses
    objectclasses: ( 2.16.840.1.113730.3.2.2 )
    -
    add: objectclasses

    Therefore, your LDIF should now be similar to the following:

    dn: cn=schema
    changetype: modify
    delete: objectclasses
    objectclasses: ( 2.16.840.1.113730.3.2.2 )
    -
    add: objectclasses
    objectClasses: ( 2.16.840.1.113730.3.2.2 NAME 'inetOrgPerson' SUP
     organizationalPerson STRUCTURAL MAY ( groupMembership $ ndsHomeDirectory
     $ loginAllowedTimeMap $ loginDisabled $ loginExpirationTime $
     loginGraceLimit $ loginGraceRemaining $ loginIntruderAddress $
     loginIntruderAttempts $ loginIntruderResetTime $
     loginMaximumSimultaneous $ loginScript $ loginTime $
     networkAddressRestriction $ networkAddress $ passwordsUsed $
     passwordAllowChange $ passwordExpirationInterval $
     passwordExpirationTime $ passwordMinimumLength $ passwordRequired $
     passwordUniqueRequired $ printJobConfiguration $ privateKey $ Profile $
     publicKey $ securityEquals $ accountBalance $ allowUnlimitedCredit $
     minimumAccountBalance $ messageServer $ Language $ UID $
     lockedByIntruder $ serverHolds $ lastLoginTime $ typeCreatorMap $
     higherPrivileges $ printerControl $ securityFlags $ profileMembership $
     Timezone $ sASServiceDN $ sASSecretStore $ sASSecretStoreKey $
     sASSecretStoreData $ sASPKIStoreKeys $ userCertificate $
     nDSPKIUserCertificateInfo $ nDSPKIKeystore $ rADIUSActiveConnections $
     rADIUSAttributeLists $ rADIUSConcurrentLimit $ rADIUSConnectionHistory $
     rADIUSDefaultProfile $ rADIUSDialAccessGroup $ rADIUSEnableDialAccess
     $ rADIUSPassword $ rADIUSServiceList $ audio $ businessCategory $
     carLicense $ departmentNumber $ employeeNumber $ employeeType $
     givenName $ homePhone $ homePostalAddress $ initials $ jpegPhoto $
     labeledUri $ mail $ manager $ mobile $ pager $ ldapPhoto $
     preferredLanguage $ roomNumber $ secretary $ uid $ userSMIMECertificate
     $ x500UniqueIdentifier $ displayName $ userPKCS12 ) X-NDS_NAME 'User'
     X-NDS_NOT_CONTAINER '1' X-NDS_NONREMOVABLE '1' )
  5. Enter the following command:

    ldapmodify -D cn_of_admin -w password -f LDIF_file_name

    For more information on working with ACLs, refer to the NetIQ eDirectory Tuning Guide.
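The manual edit in steps 2 through 4 can also be sketched in Python. The sketch below is illustrative only: it assumes that the objectClasses value has already been unfolded into a single line and that the X-NDS_ACL_TEMPLATES value is enclosed in parentheses, as in the sample output above. The function names are not part of any eDirectory tool.

```python
def strip_acl_templates(definition):
    """Remove the X-NDS_ACL_TEMPLATES clause from an objectClasses value."""
    marker = "X-NDS_ACL_TEMPLATES"
    start = definition.find(marker)
    if start == -1:
        return definition  # nothing to remove
    open_paren = definition.find("(", start)
    if open_paren == -1:
        raise ValueError("expected '(' after X-NDS_ACL_TEMPLATES")
    depth, in_quote = 0, False
    for i in range(open_paren, len(definition)):
        ch = definition[i]
        if ch == "'":
            in_quote = not in_quote  # ignore parentheses inside quotes
        elif not in_quote and ch == "(":
            depth += 1
        elif not in_quote and ch == ")":
            depth -= 1
            if depth == 0:
                # Drop the clause and tidy the whitespace left behind.
                head = definition[:start].rstrip()
                tail = definition[i + 1:].lstrip()
                return head + " " + tail
    raise ValueError("unbalanced parentheses in X-NDS_ACL_TEMPLATES clause")


def acl_template_removal_ldif(oid, definition):
    """Build the delete-then-add modify record described in step 4."""
    return "\n".join([
        "dn: cn=schema",
        "changetype: modify",
        "delete: objectclasses",
        f"objectclasses: ( {oid} )",
        "-",
        "add: objectclasses",
        f"objectClasses: {strip_acl_templates(definition)}",
    ]) + "\n"
```

Review the generated LDIF before passing it to ldapmodify, because the schema change applies to every object of that class created afterward.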

Enabling/Disabling Inline Cache

You can enable or disable the Inline Change Cache for a server. You can disable Inline Change Cache only when Outbound Synchronization is disabled. Enabling Outbound Synchronization also enables Inline Change Cache. Disabling Inline Change Cache marks the change cache as invalid for this replica and tags it with an invalid flag in Agent Configuration > Partitions. Enabling Inline Change Cache removes the invalid change cache flag when the change cache is rebuilt.

Increasing the LBURP Time Out Period

By default, the time out period for a client is 20 minutes (1200 seconds). During bulkload, however, the server can become too busy processing the data sent by the ICE client to respond within that time. This can happen when the LBURP transaction size is as high as 250, the objects have a large number of attributes with very large values, and LBURP concurrent processing is enabled at the server. When it happens, the ICE client times out.

Therefore, we recommend that you increase the time out period. You can do this by exporting the environment variable LBURP_TIMEOUT with a high value (in seconds). For example, to export the LBURP_TIMEOUT variable with 1200 seconds, enter the following:

export ICE_LBURP_TIMEOUT=1200

7.4.2 Using ldif2dib for Bulkloading

You can specify the LDIF file containing the data to be imported and the path to the database files where data needs to be imported through the command line interface. Using ldif2dib to bulkload data requires the following steps:

  1. Take a backup of the DIB.

    For more information on the backup and restore process, see Section 15.0, Backing Up and Restoring NetIQ eDirectory.

  2. Stop the eDirectory server.

  3. To start bulkloading from the LDIF file, enter the following at the command prompt:

    ldif2dib <LDIF File Name> [Options]

    Where

    • LDIF File Name: Specifies the name of LDIF file to bulkload.

    • Options: These are optional and specify the different parameters that you can use for tuning this utility. The options supported by the ldif2dib utility are listed below:

    For example, to specify batch mode, the cache size, and the block cache percentage, enter the following command:

    ldif2dib 1MillionUsers.ldif -b/novell/log/logfile.txt -c314572800 -p90

    HINT: You can temporarily suspend the bulkload by pressing the s or S key. Press the Escape key (Esc) to stop the bulkload.

7.4.3 Multiple Instances

ldif2dib can be used to bulkload entries from LDIF files to a particular instance of eDirectory (DIB) by specifying the location of its nds.db file with the -n option. If the location of the nds.db file is not specified with the -n option and if there is a single instance of eDirectory configured on the system, ldif2dib automatically detects the location of its database files. However, if there are multiple instances, ldif2dib displays a menu listing all configured instances and allows you to choose an instance for bulkload.

For more information on the multiple instances of eDirectory, see Using ndsconfig to Configure Multiple Instances of eDirectory 9.0 in the NetIQ eDirectory Installation Guide.

7.4.4 Tuning ldif2dib

This section contains information about the parameters that can be used to tune ldif2dib:

Tuning the Cache

The database cache setting is one of the more significant settings that affect eDirectory performance. If it is set too low, eDirectory operations slow down because information must be retrieved from the disk more often. If it is set too high, not enough memory is available for other processes to run and the whole system slows down. For more information on cache, see Modifying FLAIM Cache Settings in the NetIQ eDirectory Tuning Guide.

Bulkload performance generally increases on increasing the cache size. However, no performance improvement has been observed by increasing the cache size beyond a value which is 3.8 times the size of the LDIF file.
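As a rough illustration of this ceiling, the following Python sketch computes 3.8 times the size of an LDIF file and caps it at a memory budget that you supply. It assumes that the value given to the -c option is in bytes, which the earlier example (-c314572800) suggests but does not state. The function names and the budget parameter are illustrative.

```python
import os


def cache_cap_bytes(ldif_size_bytes):
    """Return 3.8 times the LDIF size, using integer arithmetic."""
    return ldif_size_bytes * 38 // 10


def cache_option(ldif_path, available_bytes):
    """Return a -c option no larger than the cap or the memory budget."""
    cap = cache_cap_bytes(os.path.getsize(ldif_path))
    # Assumption: the -c value is expressed in bytes.
    return f"-c{min(cap, available_bytes)}"
```

Capping at the memory budget reflects the earlier warning that a cache set too high starves other processes.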

Transaction Size

The transaction size defines the chunk size in terms of number of objects per transaction. When the transaction size is high, a small number of large chunk writes result and when it is low, a large number of small chunk writes result.

Bulkload performance increases with higher transaction sizes. A transaction size of zero is a special case that allows unlimited objects per transaction. When the transaction size is zero, performance is high because the commit is done only once, at the end of the bulkload. However, we do not recommend setting the transaction size to 0 for very large LDIF files (larger than one million objects). For such files, you can set the transaction size as high as 4000.
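One way to apply this guidance is sketched below in Python: count the entries in the LDIF file, then choose 0 for files of up to one million objects and 4000 above that. The threshold and values come from the paragraph above, but turning them into a fixed rule is an assumption of this sketch, as are the function names.

```python
def count_entries(ldif_text):
    """Count LDIF records by counting the lines that start a DN."""
    return sum(1 for line in ldif_text.splitlines()
               if line.lower().startswith("dn:"))


def suggest_transaction_size(entry_count, threshold=1_000_000, large_size=4000):
    """Return 0 (unlimited) for smaller loads and 4000 for very large ones."""
    if entry_count > threshold:
        return large_size
    return 0
```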

Index

Although the use of indexes leads to higher search performance, it makes bulkload slower because indexes need to be updated for every object loaded to the DIB. This is especially true for substring indexes. Therefore, when you are bulkloading a large number of objects, you can suspend indexes to speed up the bulkload. The indexes are automatically resumed when the eDirectory server is brought up. Use the -x option to disable indexes before loading entries using ldif2dib.

Block Cache Percent

If substring indexes are enabled for attributes, we recommend setting the block cache percent to 50%. If substring indexes are disabled, you can set the block cache percent to 90%.

Check Point Interval

The check point interval is the time for which the database waits before it initiates the check point background thread, which brings the on-disk version of the database up to the same coherent state as the in-memory (cached) database. This check point thread flushes the dirty cache to the disk and then cleans up the roll forward log. Because bulkload is temporarily suspended while the check point thread runs, we recommend that you set the check point interval to a high value to achieve faster bulkloads.

7.4.5 Limitations

This section contains limitations of the ldif2dib utility:

Schema

  • The LDIF file should mention all the object classes that an entry belongs to. An entry can belong to multiple object classes because of inheritance. For example, an entry of type inetOrgPerson should have the following syntax in the LDIF file:

    objectclass: inetorgperson
    objectclass: organizationalPerson
    objectclass: person
    objectclass: top
  • Currently, the following syntaxes are not supported:

ACL Templates

ACLs that are specified in the ACL templates for an object class are not automatically added for objects bulkloaded using ldif2dib.

Options

On Linux, if the -b option is used, the screen that displays statistics disappears after the bulkload is complete. The final statistics, however, are written to the log file for reference.

Simple Password LDIF

On Windows, uploading an LDIF file that contains simple passwords might fail if the NICI keys in the system and Administrator folders are not in sync. To work around this issue, access the keys present in the nici/system folder as follows:

  1. Go to the C:\Windows\system32\novell\nici\ folder.

  2. Back up the files present in the Administrator folder.

  3. Get access to the system folder and its files as follows:

    1. Go to the Security tab in the Properties window of the system folder.

    2. Select Advanced Options and go to Owner tab.

    3. Select Administrator.

    4. Go back to the Security tab and add Administrator to the list.

      Repeat similar steps to get read access to all the files inside the system folder.

  4. Overwrite the files in the Administrator folder with the ones in the system folder.

  5. Once the upload is done, copy the backed-up files back to the Administrator folder.

  6. Revert the Administrator's access to the system folder and also the files within the folder.

Custom Classes

Bulkloading an LDIF with a large number of container objects using ldif2dib can result in a memory build up leading to a -150 error being reported.

Filtered Replicas

eDirectory does not support bulkloading operations to filtered replicas.

7.4.6 Caveats

Behavior of ldif2dib is undefined in the following scenarios:

Duplicate Entries

Uploading LDIF files that contain duplicate entries, or entries already present in the DIB, without the -u option causes those entries to be added more than once, which leaves the DIB in an inconsistent state. If you are not sure whether entries are repeated in the LDIF file or are already present in the DIB, use the -u option during the bulkload.
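Before a bulkload, you can also check the LDIF file itself for repeated DNs. The following Python sketch is a minimal illustration, not a replacement for the -u option: it joins folded continuation lines, compares DNs case-insensitively, does not decode base64-encoded DNs, and cannot detect entries that already exist in the DIB. The function names are illustrative.

```python
from collections import Counter


def iter_dns(ldif_text):
    """Yield the DN of each record, joining folded continuation lines."""
    current = None
    for line in ldif_text.splitlines():
        if line.startswith(" ") and current is not None:
            current += line[1:]  # LDIF continuation of the DN line
            continue
        if current is not None:
            yield current
            current = None
        if line.lower().startswith("dn:"):
            current = line[3:].strip()
    if current is not None:
        yield current


def duplicate_dns(ldif_text):
    """Return the DNs that appear more than once, in lowercase."""
    counts = Counter(dn.lower() for dn in iter_dns(ldif_text))
    return sorted(dn for dn, n in counts.items() if n > 1)
```

An empty result only means that no DN is repeated within the file.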

No Schema Checks

ldif2dib does not perform any schema checks. As a result, you can add an attribute to an object even if the attribute does not belong to the schema of the object. This would leave the DIB in an inconsistent state. Use ldif2dib only when you are sure that the LDIF data does not need schema checks.

Insufficient Space on Hard-Drive

Behavior of ldif2dib is undefined when there is not enough space on the hard-drive for all the objects being loaded. You need to make sure that there is sufficient space for all the objects before starting the bulkload.

Forced Termination

Forcefully terminating the ldif2dib process can leave the DIB in an inconsistent state. Use the Escape key to gracefully exit the bulkload.

Terminal Resizing

Resizing the terminal during bulkload can distort the statistics displayed on the user interface. Terminal resizing should be avoided while bulkload is in progress.