4.5 Understanding Collectors

Identity Governance provides templates to simplify the collection of data. Collection templates, or collectors, are the default mappings of identity, account, or permission data from identity and application sources to the core Identity Governance schema. Each collector has one or more views that describe the characteristics of the data you can collect from the source. In these views, you specify which data to collect from your identity or application source and how that data is linked together in the catalog. The views are different for identity and application sources. For example, the JDBC Identity (Oracle) collector template can collect data for users, groups, group-to-group associations, and group-to-user associations. Collectors for application sources gather either account or permission data.

4.5.1 Understanding Collector Configuration

Identity Governance provides a large set of collector templates that contain default data and configuration settings for many common enterprise and cloud data sources. Every collector has the following common configuration elements:

Collector template

Collector templates include predefined attribute mappings and value transformation policies for specific data source types. Select a template that best suits the data source. For example, select AD Identity to collect identities from Active Directory. The templates support the following types of data sources:

  • Active Directory

  • Azure Active Directory

  • CSV file

    The CSV collector also supports TSV files. To collect from a TSV file, enter the word tab (in uppercase, lowercase, or any combination) in the Column Delimiter field. To collect from a CSV file, you must specify the full path to the file.

  • eDirectory

  • Google Apps

  • Identity Manager

  • JDBC, such as Oracle or PostgreSQL

  • Resource Access Control Facility (RACF)

  • Salesforce.com

  • SAP HR

  • SAP User Management

  • ServiceNow

  • SharePoint

NOTE: Templates whose names end in "with changes" can be enabled for processing incremental change events.

To see all the data source types, select Collector Template when you create the data source.

To collect data from a JDBC or SAP source, Identity Governance needs the appropriate third-party connector libraries to be installed on the Identity Governance server. For more information, see Identity Governance Server System Requirements in the Identity Governance 3.6 Installation and Configuration Guide.

You can also customize an existing template or create your own. For more information, see Section 3.2, Customizing the Collector Templates for Data Sources.

Service Parameters

These are the configurable parameters that allow the collector to connect and, if required, authenticate to the target data source. These typically include file locations, server host and port specifications, or service URLs. This section includes a Test connection button to verify the settings.

Test Collection and Troubleshooting

This option allows you to preview data before running a full collection, preserve the configuration for a data source, or create an emulation package for a data source. You can use the generated files to validate and troubleshoot collections, send results to support engineers, and import data source configurations into a different environment.

Transformation Scripts

This view in the collector template allows you to view transformation script usage information.

For more information about configuring specific data source collector templates, see:

4.5.2 Transforming Data During Collection

Because each application might have its own format for the data that you plan to collect, you might need to transform the data during the data collection process. For example, the application might store dates as a string (20151202) that needs to be converted to the Identity Governance date format, which is the Java Date format in milliseconds. Also, an application might use field lengths that do not match the field length in Identity Governance. These variations in collected data affect your ability to use the data or merge it with data collected from other sources.

You can add a transformation script to any mapped data field in any data collector by clicking the ‘{}’ icon next to the field mapping. The dialog expands so that you can either upload a transformation file or paste in transformation text. You can also delete a transformation script, but only after you remove all references to it from the attribute mappings that use it.

Transformations are written in Nashorn-compatible JavaScript. Within the script, the collected value is available in a variable named inputValue. After manipulating the collected value, return it to Identity Governance by assigning it to a variable named outputValue.

The following example translates the values true and false from the connected system to active and inactive in the Identity Governance catalog.

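// inputValue holds the value collected from the connected system.
if (inputValue == 'true') {
    outputValue = 'active';
}
else {
    // Any value other than 'true' is treated as inactive.
    outputValue = 'inactive';
}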
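The date and field-length cases described earlier in this section can be handled in the same way. The following is a minimal sketch, not a script shipped with the product. The helper names toEpochMillis and truncateTo are hypothetical, the date is assumed to arrive in yyyyMMdd form and is interpreted as midnight UTC, and the maximum length is whatever you pass in. Only ES5 syntax is used so that the code stays Nashorn-compatible.

```javascript
// Hypothetical helper: convert a yyyyMMdd string such as '20151202'
// to milliseconds since the epoch (assumption: midnight UTC).
// It checks the overall shape only and does not reject out-of-range
// months or days.
function toEpochMillis(value) {
    var match = /^(\d{4})(\d{2})(\d{2})$/.exec(String(value));
    if (match === null) {
        return null; // not in the expected yyyyMMdd shape
    }
    var year = parseInt(match[1], 10);
    var month = parseInt(match[2], 10) - 1; // JavaScript months are 0-based
    var day = parseInt(match[3], 10);
    return Date.UTC(year, month, day);
}

// Hypothetical helper: shorten a value to at most maxLength characters.
function truncateTo(value, maxLength) {
    var text = String(value);
    return text.length > maxLength ? text.substring(0, maxLength) : text;
}

// Inside a transformation script, inputValue is supplied by the collector.
// The guard only allows this sketch to be loaded outside Identity Governance.
if (typeof inputValue !== 'undefined') {
    outputValue = toEpochMillis(inputValue);
}
```

With these assumptions, toEpochMillis('20151202') returns 1449014400000. Confirm the type and length that the target attribute expects in your schema before adapting either helper.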

To add or delete a transformation script:

  1. Log in as a Global or Data Administrator.

  2. Select a configured data source, and then expand a collector view to view related attributes.

  3. Click the ‘{}’ icon next to the field mapping to add a script.

    or

  4. Delete a script.

    NOTE: Before you can delete a script, you must remove all references to it from the attribute mappings.

    1. Expand the Transformation scripts view of the data collector to see its usage.

    2. Expand the collector view(s) mentioned in the usage information.

    3. Click the ‘{...}’ icon next to the field mapping and choose Select a script... to clear the script usage from the attribute mapping.

    4. Repeat the previous step until no attribute mapping uses the script.

    5. Expand the Transformation scripts view and select the delete icon to delete the script.

For more information about transformations, see the Collected Data Transformations reference.

4.5.3 Testing Collections

When creating, updating, or troubleshooting data collectors, you can test all or part of the collections without publishing the results to the catalog. A test collection lets you confirm that the collector is correctly configured or, if it is not, change the collector configuration and quickly test again to check the results.

You can view the collected data as soon as the test collection completes, or you can download the results to view later. Results of test collections remain available in Identity Governance until you delete them.

When you run a test collection, you have some options for the test data:

  • All records

  • Some records

    When you select a subset of records to collect, you cannot control which records to collect. You could use this option if you want to quickly spot check a collector configuration rather than waiting for all the data to be collected.

  • Raw data

    Raw data contains attribute names from the native application. These attributes have not yet been transformed based on the mappings in the collector. Testing the raw data collection lets you verify that you are collecting the data you intend to collect before Identity Governance transforms it.

  • Transformed data

    Transformed data contains attribute names that you have mapped from the native application to the attribute names you are using within Identity Governance. Testing the transformed data collection lets you verify that your mappings within the data collector meet your expectations.
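The difference between raw and transformed test data can be pictured as applying the collector's attribute mappings to each record. The following is a minimal sketch of that idea only. The attribute names (sAMAccountName, mail, pager, userId, email) and the applyMappings helper are hypothetical, and the sketch does not reflect how Identity Governance implements collection internally.

```javascript
// Hypothetical mapping from native attribute names to catalog attribute names.
var mappings = {
    sAMAccountName: 'userId',
    mail: 'email'
};

// Return a new record whose keys are renamed according to the mappings.
// Assumption for this sketch: attributes without a mapping are left out.
function applyMappings(rawRecord, attributeMappings) {
    var transformed = {};
    for (var nativeName in attributeMappings) {
        if (Object.prototype.hasOwnProperty.call(attributeMappings, nativeName) &&
            Object.prototype.hasOwnProperty.call(rawRecord, nativeName)) {
            transformed[attributeMappings[nativeName]] = rawRecord[nativeName];
        }
    }
    return transformed;
}

// A raw record uses the native application's attribute names.
var rawRecord = { sAMAccountName: 'jdoe', mail: 'jdoe@example.com', pager: '555-0100' };

// The transformed record uses the mapped names.
var transformedRecord = applyMappings(rawRecord, mappings);
// transformedRecord is { userId: 'jdoe', email: 'jdoe@example.com' }
```

Comparing the two records side by side is essentially what you do when you run a raw test collection and then a transformed one against the same view.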

To test a sample collection from a data source:

  1. Log in as a Global or Data Administrator.

  2. Select a configured data source.

  3. Select Test Collection and Troubleshooting.

  4. Under Test Collection, select the collectors, and then select Run Test Collection.

  5. Select the specific entities to collect.

  6. (Conditional) To collect a subset of records, type the number of records to collect.

  7. (Conditional) To collect all records, make no changes to the default All value.

  8. Select raw data or transformed data collection to run.

  9. After the test collection shows Complete, select Action to view, download, or delete the test collection results.

4.5.4 Creating Emulation Packages

You can more easily troubleshoot collection configuration outside your production environment by creating emulation packages for data collectors. An emulation package contains CSV files with the raw collected data from the data source and a CSV file containing data source configuration details. Emulation packages remain available in Identity Governance until you delete them.

To create an emulation package:

  1. Select a configured data source.

  2. Select Test Collection and Troubleshooting.

  3. Under Download and Emulation, select Create emulation package.

  4. When the emulation status shows Complete, select Action to view, download, or delete the emulation package.

4.5.5 Downloading and Importing Collectors

The ability to download and import collectors helps you manage your environment in several ways:

  • Back up a working collector

  • Replicate an environment

  • Update collector details in a text editor

  • Troubleshoot collections

Configuring collectors can take time, and you might go through several iterations of trial and error. When you have configured a collector that achieves the results you want, you should download it and save it with your other backup files. You can also use downloaded collectors to replicate an environment, either in a test environment or to use in another office location.

You might need to change the predefined attribute mappings and value transformation policies of a template to suit your specific environment. If you need to customize a collector template, rather than only editing the values in a collector, you can download and import collector templates under Configuration in Identity Governance. For more information, see Customizing the Collector Templates for Data Sources.

NOTE: To correctly import data, you must download data sources from the current version of Identity Governance.

When you download a data source, the zipped file has the name of the data source, for example, AD_Identities.zip. The files within the zipped file are generically named in English and can include the following files:

  • Identity_Source.json or Application_Source.json file, depending on the type of data source, which contains the configuration of the data source and all of its collectors.

  • Attribute files containing the schema elements used by the collectors within the data source. For example, USER_Attributes.json, PERMISSION_Attributes.json, and APPLICATION_attributes.json.

  • Template files containing the collector template name and version used to create the collectors in the data source. For example, Template_AD-Account_3.6.0.json.

  • Categories.json file when categories are applied to the source.

To download a data source and its associated files:

  1. Select a data source, then select Test Collection and Troubleshooting.

  2. Select Download and Emulation.

  3. Click Download Data Source Configuration.

    1. Type a meaningful description such as the collector name.

    2. (Optional) Download included templates, assigned categories, and associated attribute definitions.

    3. Select the download icon on the top title bar to access the saved file and download the file.

      HINT: We recommend creating a folder for each data source zipped file and extracting the contents into that folder. This ensures that similarly named files from different sources are not mixed together and do not overwrite one another.

To import the associated files and the data source:

  1. (Conditional) If your data source has custom schema or categories associated with it, import the previously downloaded schema files or category files before importing the data source. To import attribute definitions, navigate to the respective attribute page under Data Administration and import the respective attribute file. To import categories and templates, select the respective options under Configuration.

  2. Under Data Sources, select Identities or Applications.

  3. Select Import an identity source or Import an application source.

  4. Based on the type of data source, select the Identity_Source.json or the Application_Source.json file.
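When an extracted download contains many files, the import order described in the steps above can be sketched as a small sort over the file names. This is an illustration only: the function names classifyDownloadedFile and importOrder are hypothetical, the name patterns are inferred from the example file names in this section, and the relative order of attribute, category, and template files is an assumption because the steps require only that they precede the data source file.

```javascript
// Assumed import order: supporting files first, the data source file last.
var IMPORT_RANK = { attributes: 0, categories: 1, template: 2, source: 3, unknown: 4 };

// Classify a file from an extracted download by its name.
function classifyDownloadedFile(fileName) {
    if (fileName === 'Identity_Source.json' || fileName === 'Application_Source.json') {
        return 'source';
    }
    if (fileName === 'Categories.json') {
        return 'categories';
    }
    if (/^Template_.+\.json$/.test(fileName)) {
        return 'template';
    }
    if (/_attributes\.json$/i.test(fileName)) {
        return 'attributes';
    }
    return 'unknown';
}

// Return a copy of the file names sorted into the assumed import order.
function importOrder(fileNames) {
    return fileNames.slice().sort(function (a, b) {
        return IMPORT_RANK[classifyDownloadedFile(a)] - IMPORT_RANK[classifyDownloadedFile(b)];
    });
}

var ordered = importOrder([
    'Application_Source.json',
    'Template_AD-Account_3.6.0.json',
    'PERMISSION_Attributes.json',
    'Categories.json'
]);
// ordered is PERMISSION_Attributes.json, Categories.json,
// Template_AD-Account_3.6.0.json, Application_Source.json
```

The sketch only orders names for a checklist. The imports themselves are still performed on the pages named in the steps above.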