FieldXform

A FieldXform object is used to normalize data values in an HVEvent object.

Properties

  • String name: (Required) Name of the xform.

  • String fieldName: (Required) Name of the field to be read by the xform.

  • String matchExpression: (Required) Regular expression describing field values that should be transformed.

  • String replaceExpression: (Required) Replacement expression describing how to transform values that match the matchExpression.

  • String destFieldName: Name of the field where transformed data should be written.

  • Object caseRule: Rule for changing the case of matching fields.

Methods

  • isValid(): Returns True if all required fields are populated and the expression fields contain valid regular expressions.

    If False is returned, the scripting log contains a description of the failure.

  • toString(): Returns a string representation of this object.
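
The checks that isValid() is described as performing can be sketched in plain JavaScript. This is only an illustration: checkXform is a hypothetical helper, not part of the product API, it works on plain objects rather than real FieldXform objects, and it tests the expression with the JavaScript RegExp engine rather than with java.util.regex.

```javascript
// Hypothetical helper that mirrors the documented isValid() checks.
// Returns null when no problem is found, otherwise a description of the failure.
function checkXform(xform) {
  var required = ['name', 'fieldName', 'matchExpression', 'replaceExpression'];
  for (var i = 0; i < required.length; i++) {
    if (!xform[required[i]]) {
      return 'missing required property: ' + required[i];
    }
  }
  try {
    // Assumption: JavaScript syntax check used as a stand-in for java.util.regex.
    new RegExp(xform.matchExpression);
  } catch (e) {
    return 'invalid matchExpression: ' + e.message;
  }
  return null;
}

var complete = checkXform({
  name: 'URLKey Xform',
  fieldName: 'url',
  matchExpression: 'http[s]?://[^/]+([^#?]+).*',
  replaceExpression: '$1'
});
var incomplete = checkXform({ name: 'URLKey Xform' });
var badRegex = checkXform({
  name: 'Broken',
  fieldName: 'url',
  matchExpression: '([a-z]+',   // unterminated group
  replaceExpression: '$1'
});
```

In the real API, the equivalent step is to call isValid() on the xform and consult the scripting log when it returns False.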

Description

A FieldXform object is used to normalize data values in an HVEvent object. The name, fieldName, matchExpression, and replaceExpression fields in this object are required. In addition, the matchExpression field must contain a valid regular expression, as described here:

http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html.

The replaceExpression can contain literal data as well as references to substrings captured by the matchExpression. A capture reference is written as a dollar sign ($) followed by the index of the desired capture field, where index 1 indicates the first capture field. A literal dollar sign must be preceded by a backslash to indicate that it is not a capture reference. The fieldName value is normally one of the standard field names described in metrics.constants Field Names. However, custom field names created by a previous FieldXform can also be used. HVEvents that have no value in the field specified by fieldName never match.
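
The capture-reference rules can be tried outside the product with a small stand-alone sketch. The helper expandReplacement is hypothetical and is not part of the Experience Manager API; it only mimics the behavior described above, where $n inserts capture field n and a backslash before a dollar sign produces a literal dollar sign.

```javascript
// Hypothetical helper: expands a replaceExpression-style template.
// groups[0] is the whole match; groups[1] is the first capture field.
function expandReplacement(template, groups) {
  return template.replace(/\\(.)|\$(\d+)/g, function (whole, escaped, index) {
    if (escaped !== undefined) {
      return escaped;                    // "\$" yields a literal "$"
    }
    var value = groups[Number(index)];
    return value === undefined ? '' : value;
  });
}

// Two capture fields: the digits before and after the decimal point.
var groups = /^(\d+)\.(\d+)$/.exec('19.99');

var withReferences = expandReplacement('$1 dollars and $2 cents', groups);
var withLiteralDollar = expandReplacement('\\$$1.$2', groups);

console.log(withReferences);     // 19 dollars and 99 cents
console.log(withLiteralDollar);  // $19.99
```

Note that the JavaScript string literal '\\$$1.$2' holds the characters \$$1.$2, so the backslash itself must be doubled when the expression is written inside a script.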

When an HVEvent is applied to a FieldXform, the FieldXform compares the value of the specified field with its matchExpression. If the value does not match the expression, no changes are made to the HVEvent. If the value does match, the replaceExpression is used to transform the value and, if a caseRule is specified, it is used to normalize the case of the transformed value. The resulting value is then saved to the destination field or, if no destination field is specified, back to the field from which the source value was taken.
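
The processing steps above can be sketched as a stand-alone function. Everything here is an assumption made for illustration: plain objects stand in for HVEvent and FieldXform, applyXform and the 'lower'/'upper' case rule values are hypothetical, and the whole field value is assumed to have to match the expression.

```javascript
// Hypothetical model of applying an xform to an event (plain objects only).
function applyXform(event, xform) {
  var source = event[xform.fieldName];
  if (source === undefined || source === null) {
    return false;                        // no value in fieldName: never matches
  }
  // Assumption: the entire value must match, so the pattern is anchored.
  var re = new RegExp('^(?:' + xform.matchExpression + ')$');
  if (!re.test(source)) {
    return false;                        // no match: the event is left unchanged
  }
  var result = String(source).replace(re, xform.replaceExpression);
  if (xform.caseRule === 'lower') { result = result.toLowerCase(); }
  if (xform.caseRule === 'upper') { result = result.toUpperCase(); }
  // Write to the destination field, or back to the source field if none is set.
  event[xform.destFieldName || xform.fieldName] = result;
  return true;
}

// No destFieldName, so the transformed value overwrites the host field.
var stripWww = {
  fieldName: 'host',
  matchExpression: 'www\\.(.+)',
  replaceExpression: '$1',
  caseRule: 'lower'
};

var matching = { host: 'www.Example.COM' };
var nonMatching = { host: 'mail.example.com' };
var missingField = { url: 'http://www.example.com/' };

var matchingChanged = applyXform(matching, stripWww);
var nonMatchingChanged = applyXform(nonMatching, stripWww);
var missingChanged = applyXform(missingField, stripWww);
```

After this runs, matching.host holds the transformed, lower-cased value, while the other two events are untouched.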

The example for FieldXform (see Example) is taken from the DemoMetrics.js script included in the Experience Manager installation. It is designed to identify the same page across multiple hosts, while suppressing any parameters passed to the page. To do this, it evaluates the URL field in the HVEvent against the match expression:

http[s]?://[^/]+([^#\?]+).*

Table A-1 breaks this match expression into its parts and describes how each part is interpreted.

Table A-1 matchExpression Interpretations

Match Characters   Interpretation
----------------   --------------
http               Matches only the literal string “http”.
[s]?               Matches zero or one “s” in this position.
://                Matches only the literal string “://” in this position.
[^/]+              Matches one or more of any character up to but not including the / character.
([^#\?]+)          Matches one or more of any character up to but not including # or ? characters. The parentheses mark this section as a capture field that is referenced in replaceExpression.
.*                 Matches any number of trailing characters of any type.

Based on this match expression, the xform takes the value of the first capture field ($1), converts it to lower case, then saves it to a custom HVEvent field called urlKey. Table A-2 shows sample URL values and the resulting urlKey values produced by this xform:

Table A-2 Sample URL values and the resulting urlKey values

URL                                                                                        urlKey
---                                                                                        ------
http://www.novell.com/products/experience-manager/                                         /products/experience-manager/
http://localhost:8080/Plugins/invoke.pl?op=threadDump                                      /plugins/invoke.pl
http://us.mc1102.mail.yahoo.com/mc/welcome?.gx=1&.tm=1250689315&.rand=3hsbmdsrkkmg6        /mc/welcome
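
The rows of Table A-2 can be reproduced with ordinary JavaScript regular expressions, which is a convenient way to try out a match expression before configuring an xform. This sketch does not use the metrics API; the urlKey function is a hypothetical stand-in, and the pattern is anchored on the assumption that the whole URL must match.

```javascript
// Same expression as the example, written as an anchored JavaScript RegExp.
var matchExpression = /^http[s]?:\/\/[^\/]+([^#?]+).*$/;

// Hypothetical stand-in for the xform: first capture field, lower-cased.
function urlKey(url) {
  if (!matchExpression.test(url)) {
    return null;                         // non-matching URLs produce no key
  }
  return url.replace(matchExpression, '$1').toLowerCase();
}

var samples = [
  'http://www.novell.com/products/experience-manager/',
  'http://localhost:8080/Plugins/invoke.pl?op=threadDump',
  'http://us.mc1102.mail.yahoo.com/mc/welcome?.gx=1&.tm=1250689315&.rand=3hsbmdsrkkmg6'
];

var keys = samples.map(urlKey);
console.log(keys.join('\n'));
```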

This approach does not suit every Web site. For example, it collapses pages that are distinguished only by their query parameters. Devise normalization schemes that produce useful values for your own data.

Example

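// Create the xform and give it a name.
var ulx = metrics.createXform('URLKey Xform');
// Read the URL field of each HVEvent.
ulx.fieldName = metrics.constants.FN_URL;
// Capture the path after the host, dropping any query string or fragment.
ulx.matchExpression = 'http[s]?://[^/]+([^#\?]+).*';
ulx.replaceExpression = '$1';
// Write the lower-cased path to the custom urlKey field.
ulx.destFieldName = 'urlKey';
ulx.caseRule = metrics.constants.CASE_TO_LOWER;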