2.1 Preferences

Analyzer preferences let you configure some general environment settings, and include the following:

2.1.1 Connections

The Connections preferences page lets you manage the Java libraries that Analyzer needs to communicate with different applications. You can add and remove Java libraries (typically .jar files) from this page.

2.1.2 Data Browser

The Data Browser preferences page lets you manage how the Data Browser displays the data records in a data set instance. To access these settings, select Window > Preferences, then select Analyzer > Data Browser in the left navigation area.

Table 2-1 Analyzer Preferences - Data Browser

| Setting | Description |
| --- | --- |
| Show Display settings for attributes when opening the Data Browser | Enables/disables displaying a warning dialog box about having more than 12 attributes with multiple values in a data set definition. Use this dialog box to limit the attributes displayed in the Data Browser to improve performance during data inspection and analysis. |
| Total number of rows per page | Specifies the number of records to display on a page in the Data Browser. |

2.1.3 Database Settings

Analyzer database settings configure the database that Analyzer uses to organize and store its data. To access these settings, select Window > Preferences, then select Analyzer > Database Settings in the left navigation area.

You can also select the Preferences icon in the Data Set Instances view.

Table 2-2 Analyzer Preferences - Database Settings

| Setting | Description |
| --- | --- |
| Select Alias | Specifies the database alias that Analyzer uses. |
| Add Alias | Opens the New Alias dialog box so you can configure a new database alias. The new alias requests the same information that is displayed in the Database Settings page. |
| Remove Alias | Deletes the currently selected database alias. However, you must have at least one database alias for Analyzer to run. |
| Driver | Specifies the driver that Analyzer uses to access the database. Analyzer supports either an internal HSQLDB database or an external MySQL database. |
| Use SSL | Enables SSL connections between Analyzer and an external MySQL database used as the Analyzer database. To use SSL, the MySQL JDBC library must be version 5.1.6 or higher (for example, mysql-connector-java-5.1.6-bin.jar). NOTE: MySQL must be configured to use SSL before you enable Use SSL. For more information, see Using SSL for Secure Connections in the MySQL Reference Manual. |
| JDBC URL | Specifies the JDBC URL where you can access the database. When you are using the internal HSQLDB, this field is not configurable. |
| Database | Specifies the database name. |
| Batch Size | Specifies the batch size of the records that Analyzer imports from the external database. By default, the batch size is set to 1000. You can configure the batch size in Window > Preferences > Analyzer > Database Settings > Batch Size. This setting is available only if the external database is MySQL. |
| Store Password | Saves the password on the local system. For the internal database (HSQLDB), the password is stored locally by default. For an external database (MySQL), you can select or deselect this option. |
| Username | Specifies a valid username with which Analyzer can access the database. |
| Password | Specifies the database access password for the specified user. |
| JDBC Paths | Specifies the path to the database’s JDBC libraries. You must provide a classpath if you are using an external database with Analyzer. |
| Class Name | Specifies the class name for the driver. |
| Test Connection | Tests the database connection by using the specified database parameters. |
| Restore Defaults | Resets the database settings to Analyzer’s default (internal database) configuration. |
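The version requirement for Use SSL in Table 2-2 can be checked before you add a MySQL JDBC library. The following Python sketch is illustrative only and is not part of Analyzer. The function names are hypothetical, and the sketch assumes that the library version appears in the file name in the same form as the example file name in the table.

```python
import re

# Minimum MySQL JDBC library version required for Use SSL (from Table 2-2).
MIN_SSL_VERSION = (5, 1, 6)


def connector_version(jar_name):
    """Return (major, minor, patch) parsed from the file name, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", jar_name)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def supports_ssl(jar_name):
    """True if the version in the file name is 5.1.6 or higher."""
    version = connector_version(jar_name)
    # Tuples compare element by element, so (5, 1, 10) >= (5, 1, 6).
    return version is not None and version >= MIN_SSL_VERSION


print(supports_ssl("mysql-connector-java-5.1.6-bin.jar"))  # True
print(supports_ssl("mysql-connector-java-5.0.8-bin.jar"))  # False
```

Comparing the version as a tuple of integers rather than as text avoids ranking 5.1.10 below 5.1.6.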

For a flat file import, Analyzer reads one batch of records at a time (1000 by default) and writes them to the Analyzer MySQL database server. A larger batch size generally makes the import faster, but it also produces larger packets. Import performance therefore also depends on the packet size that the database server accepts. The packet size depends on several factors, including the number of fields or attributes, the size of the value of each field or attribute, and the batch size.

If a packet exceeds the limit set on the MySQL database server, the server rejects it and the error log contains an exception such as the following: Error inserting data com.mysql.jdbc.PacketTooBigException: Packet for query is too large (9525623 > 1048576). In this message, the first number is the size of the packet in bytes and the second number is the server's current limit. To work around this problem, reduce the Analyzer batch size and retry the import. Alternatively, increase the value of the max_allowed_packet variable on the MySQL database server.
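The relationship between batch size and packet size can be estimated with simple arithmetic. The following Python sketch is not how Analyzer computes packet sizes. It assumes that the packet grows roughly in proportion to the batch size, and the function names and the headroom parameter are hypothetical. The figures are the ones from the example exception message, with the default batch size of 1000.

```python
import math


def bytes_per_record(packet_bytes, batch_size):
    """Approximate size of one record, rounded up, from a reported packet size."""
    return math.ceil(packet_bytes / batch_size)


def max_batch_size(max_allowed_packet, record_bytes, headroom=0.0):
    """Largest batch whose estimated packet fits within max_allowed_packet.

    headroom is the fraction of the limit to leave unused (an assumption,
    to allow for records that are larger than average).
    """
    usable = int(max_allowed_packet * (1 - headroom))
    return max(1, usable // record_bytes)


# Packet of 9525623 bytes reported for a batch of 1000 records.
record = bytes_per_record(9525623, 1000)
print(record)                           # 9526
print(max_batch_size(1048576, record))  # 110
```

Under these assumptions, a batch size of about 110 or lower would fit the 1048576-byte limit in the example. Leaving some headroom is sensible because record sizes vary.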

Analyzer allows you to change its internal database from the default HSQLDB to a MySQL database. You can configure database settings in Window > Preferences > Analyzer > Database Settings. When you use an external MySQL database, be aware that the MySQL database uses the operating system’s default character set to encode table fields. If the default character set does not recognize an extended or double-byte character, Analyzer displays ??? in the Data Browser. To avoid this, set the operating system’s default character set to UTF-8, or to a character set that includes all the extended or double-byte characters that Analyzer might import.
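The following Python sketch illustrates the general mechanism behind the ??? display: when text is encoded with a character set that lacks some of its characters, those characters are commonly replaced with question marks and cannot be recovered. It is only an illustration of the encoding behavior and does not reproduce the internals of Analyzer or MySQL. The helper name round_trip and the sample text are hypothetical.

```python
def round_trip(text, charset):
    """Encode text with the given character set, then decode it again.

    Characters that the character set cannot represent are replaced with "?".
    """
    return text.encode(charset, errors="replace").decode(charset)


sample = "Zoë 東京"

print(round_trip(sample, "ascii"))    # Zo? ??
print(round_trip(sample, "latin-1"))  # Zoë ??
print(round_trip(sample, "utf-8"))    # Zoë 東京
```

Only UTF-8 preserves both the extended character and the double-byte characters in this sample, which is why it is the recommended default.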

If you are importing more than 40,000 records, switch from the internal database (HSQLDB) to an external MySQL database.

2.1.4 Matching

Matching settings define the default configuration for Analyzer’s matching analysis. To access these settings, select Window > Preferences, then select Analyzer > Matching in the left navigation area.

Table 2-3 Matching Preferences

| Setting | Description |
| --- | --- |
| Uniqueness Test Percentage Required | Specifies the uniqueness threshold that a matching key/data set combination must achieve before Analyzer can run a matching analysis. Analyzer supports values between 95 and 100 percent. |
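As a rough illustration of a uniqueness threshold, the following Python sketch checks whether the values of a matching key are unique enough to meet a required percentage. The formula used here (distinct values divided by total records) is an assumption made for illustration. This section does not state how Analyzer calculates uniqueness, and the function names are hypothetical.

```python
def uniqueness_percentage(key_values):
    """Percentage of records whose key value is distinct (assumed formula)."""
    if not key_values:
        return 0.0
    return 100.0 * len(set(key_values)) / len(key_values)


def passes_uniqueness_test(key_values, required_percentage):
    """Compare the assumed uniqueness measure with the required threshold."""
    # The preference accepts values between 95 and 100 percent.
    if not 95.0 <= required_percentage <= 100.0:
        raise ValueError("required_percentage must be between 95 and 100")
    return uniqueness_percentage(key_values) >= required_percentage


# 100 records: 97 distinct keys plus one key repeated three times.
keys = [f"user{i}" for i in range(97)] + ["duplicate"] * 3

print(uniqueness_percentage(keys))       # 98.0
print(passes_uniqueness_test(keys, 95))  # True
print(passes_uniqueness_test(keys, 99))  # False
```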

2.1.5 Reporting

Reporting settings define the default configuration for the various reports you can generate from Analyzer. These are global settings that apply to all of your Analyzer projects. To access these settings, select Window > Preferences, then select Analyzer > Reporting in the left navigation area.

The Reporting settings page includes four tabs, one for each type of report you can generate. For more information about Analyzer reports, see Section 3.11, Auditing and Reporting.

Data Browser

The Data Browser report settings let you specify the following default settings.

Table 2-4 Data Browser Report Settings

| Setting | Description |
| --- | --- |
| Report Title | Specifies a name for the Data Browser report you are generating. The default title is Data Browser Report. |
| Notes | Specifies optional details describing the report. |
| Display in Landscape Mode | Displays the report suitable for landscape printing (11” x 8.5” rather than 8.5” x 11”). |
| Wrap text when values overflow | Allows cell data to wrap around within the report column. If you deselect this option, data that exceeds the width of the report column is truncated. |
| Display added/modified/deleted icons | Specifies that the report should include icon indicators for added, modified, and deleted data cells. |
| Display metric failures in red and with icons | Specifies that the report should indicate data cells that failed the analysis in red and with a failure icon. |

Analysis

The Analysis report settings let you specify the following default settings.

Table 2-5 Analysis Report Settings

| Setting | Description |
| --- | --- |
| Report Title | Specifies a name for the Analysis report you are generating. The default title is Analysis Report. |
| Notes | Specifies optional details describing the report. |
| Display in Landscape Mode | Displays the report suitable for landscape printing (11” x 8.5” rather than 8.5” x 11”). |
| Wrap text when values overflow | Allows cell data to wrap around within the report column. If you deselect this option, data that exceeds the width of the report column is truncated. |
| Display a combined summary of all datasets | Specifies that the report should include a section that summarizes the analysis results across all analyzed data sets. This option is available only when the analysis results include data from multiple data sets. |
| Display graphic for each metric | Displays a graphical representation of the results of each metric in the analysis. |
| Display Failed Data/Details for each metric | For each metric in the analysis, displays a list of the failed records. |
| Max number of records to display in Failed Data or Patterns | Specifies the maximum number of records that failed the analysis to display. This setting is active only when Display Failed Data/Details for each metric is selected. |

Uniqueness

The Uniqueness report settings let you specify the following default settings.

Table 2-6 Uniqueness Report Settings

| Setting | Description |
| --- | --- |
| Report Title | Specifies a name for the Uniqueness report you are generating. The default title is Uniqueness Report. |
| Notes | Specifies optional details describing the report. |
| Display in Landscape mode | Displays the report suitable for landscape printing (11” x 8.5” rather than 8.5” x 11”). |
| Wrap text when values overflow | Allows cell data to wrap around within the report column. If you deselect this option, data that exceeds the width of the report column is truncated. |
| Display statistical information | Displays a statistical summary of the report data, including number of records, duplicate count, a uniqueness measure, and the total number of duplicate records. |
| Display graphic | Displays a graphical (bar graph) representation of the report data. |
| Display uniqueness data | Displays a list of the duplicate values along with the number of times the value occurs in the data set instance. |
| Max number of duplicate records to display | Specifies the maximum number of duplicate values to display. This setting is active only when Display uniqueness data is selected. |

Matching

The Matching report settings let you specify the following default settings.

Table 2-7 Matching Report Settings

| Setting | Description |
| --- | --- |
| Report Title | Specifies a name for the Matching Value report you are generating. The default title is Matching Report. |
| Notes | Specifies optional details describing the report. |
| Display in Landscape mode | Displays the report suitable for landscape printing (11” x 8.5” rather than 8.5” x 11”). |
| Wrap text when values overflow | Allows cell data to wrap around within the report column. If you deselect this option, data that exceeds the width of the report column is truncated. |
| Display statistical information | Displays a statistical summary of the report data, including number of records, duplicate count, a uniqueness measure, and the total number of duplicate records. |
| Display graphic | Displays a graphical (bar graph) representation of the report data. |
| Display unmatched data | Displays a list of unmatched values from the matching analysis. |
| Max number of unmatched records to display | Specifies the maximum number of unmatched values to display. This setting is active only when Display unmatched data is selected. |
| Display matched data | Displays a list of matched values from the matching analysis. |
| Max number of matched records to display | Specifies the maximum number of matched values to display. This setting is active only when Display matched data is selected. |