5.2 Using the Regular Expression Filters

A regular expression is a pattern that describes a specific portion of text. A few Exchange Online Knowledge Scripts allow you to use regular expressions to define inclusion or exclusion filters that are pattern-matched against the text being evaluated.

The following table lists some commonly used regular expression types and their usage.

For more information about regular expression syntax, see related Web sites such as www.wikipedia.org/wiki/Regular_expression or www.regular-expressions.info.

Regular Expression Type

Description

Alternate Matches

A pipe character, |, indicates alternate possibilities. For example:

  • The expression a|b|c indicates a match with a, or b, or c.

  • The expression Exchange Online|Office Subscription|Skype for Business indicates a match with Exchange Online, or Office Subscription, or Skype for Business.
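
The alternation behavior described above can be tried out in a short sketch. It uses Python's re module purely for illustration; the regular expression engine behind the Knowledge Scripts may differ in details, and the services list is a hypothetical sample rather than product output.

```python
import re

# Hypothetical service names, used only for illustration.
services = ["Exchange Online", "Skype for Business", "SharePoint Online"]

# The pipe separates three alternatives; any one of them counts as a match.
pattern = re.compile(r"Exchange Online|Office Subscription|Skype for Business")

# Keep the names in which at least one alternative occurs.
matched = [name for name in services if pattern.search(name)]
print(matched)
```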

Anchor

Anchors do not match characters. Instead, they match a position before, after, or between characters. They anchor the regular expression match at a certain point.

  • A ^ matches a position before the first character in a text string. For example, the expression ^a applied to the text string abc returns a because a is at the beginning of the text string. The expression ^b applied to the same text string returns no value, because b is not at the beginning of the text string.

  • A $ matches a position right after the last character in a text string. For example, the expression c$ applied to the text string abc returns c because c is at the end of the text string. The expression a$ applied to the same string returns no value, because a is not at the end of the text string.
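
As a rough check of the anchor examples above, the following sketch applies the same four expressions to the text string abc. It assumes Python's re module, which may not behave identically to the engine used by the Knowledge Scripts.

```python
import re

text = "abc"

# ^ anchors the match at the start of the string.
start_a = re.search(r"^a", text)   # match object: a is the first character
start_b = re.search(r"^b", text)   # None: b is not the first character

# $ anchors the match at the end of the string.
end_c = re.search(r"c$", text)     # match object: c is the last character
end_a = re.search(r"a$", text)     # None: a is not the last character

print(start_a.group(), start_b, end_c.group(), end_a)
```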

Escape Metacharacter

A backslash character, \, placed before a special character such as ., @, |, *, ?, +, (, ), {, }, [, ], ^, $ or \ forces that special character to be interpreted as a normal character.

For example:

  • A dot (.) is usually used as a wildcard metacharacter, but if preceded by a backslash it represents the dot character itself. For information about the wildcard metacharacter, see Wildcard.

  • A colon (:) preceded by a backslash matches a literal colon, so the filter excludes or includes all device names that contain : in their names.

  • An equal sign (=) preceded by a backslash matches a literal equal sign, so the filter excludes or includes all device names that contain = in their names.
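
The effect of escaping can be seen by comparing an escaped and an unescaped dot. This is a minimal sketch in Python's re module, not the Knowledge Script engine itself, and the device names are invented for the example.

```python
import re

# Hypothetical device names, used only for illustration.
names = ["server1.contoso.com", "server1-contoso-com", "host:443", "key=value"]

def select(pattern):
    """Return the names in which the pattern occurs anywhere."""
    return [n for n in names if re.search(pattern, n)]

unescaped = select(r"server1.contoso")    # . is a wildcard: matches . and -
escaped = select(r"server1\.contoso")     # \. matches only a literal dot
with_colon = select(r"\:")                # names that contain a colon
with_equals = select(r"\=")               # names that contain an equal sign

print(unescaped, escaped, with_colon, with_equals)
```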

Literal

A literal expression consists of a single character that matches all the occurrences of that character in the text string.

For example, if the expression is a and the text string is The gray cat is purring, then the matches are the a in gray and the a in cat.

All characters except for the following are literals:

., |, *, ?, +, (, ), {, }, [, ], ^, $ and \.

These characters are treated as literals when preceded by a \.
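
The literal example above can be reproduced with a short sketch that lists where each match starts. It assumes Python's re module for illustration only.

```python
import re

text = "The gray cat is purring"

# The literal a matches every occurrence of that character.
positions = [m.start() for m in re.finditer(r"a", text)]

# Show the word that surrounds each match position.
for pos in positions:
    word = next(w for w in re.finditer(r"\S+", text) if w.start() <= pos < w.end())
    print(pos, word.group())
```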

Matching Characters or Digits

  • \d: Matches a digit.

  • \D: Matches a non-digit.

  • \s: Matches a whitespace character.

  • \S: Matches any character except a whitespace.

  • \w: Matches an alphanumeric character. In most regular expression implementations, \w also matches the underscore.

  • \W: Matches a non-alphanumeric character.
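
The character classes listed above can be compared side by side on one sample string. The sketch below assumes Python's re module, where \w also matches the underscore; other engines may treat some characters differently, and the sample text is made up.

```python
import re

text = "Room 42, Floor_3"   # hypothetical sample text

digits = re.findall(r"\d", text)       # every digit
words = re.findall(r"\w+", text)       # runs of letters, digits, or underscores
spaces = re.findall(r"\s", text)       # every whitespace character
non_word = re.findall(r"\W", text)     # everything that \w does not match

print(digits, words, spaces, non_word)
```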

Parentheses

Use parentheses, (), to group characters and then apply a repetition operator to the group.

For example, the expression (ab)* returns all of the string ababab.
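
To show why the grouping matters, the following sketch contrasts (ab)* with ab*, where the asterisk applies only to b. It is an illustration in Python's re module, not a statement about the Knowledge Script engine.

```python
import re

text = "ababab"

# With parentheses, * repeats the whole group ab.
grouped = re.fullmatch(r"(ab)*", text)

# Without parentheses, * repeats only b, so the pattern means
# "a followed by zero or more b" and cannot cover the whole string.
ungrouped = re.fullmatch(r"ab*", text)

print(grouped.group(), ungrouped)
```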

Repeat

A repeat is an expression that is repeated an arbitrary number of times.

  • A question mark, ?, indicates that the preceding character in the expression is optional. For example, the expression ba? returns b or ba.

  • An asterisk, *, indicates that the preceding character is to be matched zero or more times. For example, the expression ba* returns all instances of b, ba, baaa, and so on.

  • A plus sign, +, indicates that the preceding character is to be matched one or more times. The expression ba+ returns all instances of ba or baaaa, for example, but not b.

  • Curly braces, {}, indicate a specific number of repetitions. The expression a{2} returns the letter a repeated exactly twice. The expression a{2,4} returns the letter a repeated between 2 and 4 times. The expression a{2,} returns the letter a repeated at least twice, with no upper limit. For example, the expression ba{2,4} returns baa, baaa, and baaaa.
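
The four repetition operators above can be compared against the same set of candidate strings. This sketch assumes Python's re module and uses fullmatch so that each candidate must match in its entirety; the helper name full_matches is introduced here only for the example.

```python
import re

candidates = ["b", "ba", "baa", "baaa", "baaaa", "baaaaa"]

def full_matches(pattern):
    """Return the candidates that the pattern matches from start to end."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

optional = full_matches(r"ba?")       # a is optional: zero or one a
star = full_matches(r"ba*")           # zero or more a
plus = full_matches(r"ba+")           # one or more a
bounded = full_matches(r"ba{2,4}")    # between two and four a

print(optional, star, plus, bounded, sep="\n")
```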

Square Brackets

Use square brackets, [], to match any one of the characters that is enclosed in the brackets. You can specify a range of characters by using a hyphen.

For example, the expression [a-i], which performs the same match as [abcdefghi], returns all services whose names contain any one of the characters inside the square brackets, such as ‘a’ in “Azure Information Protection”, ‘e’ in “Exchange Online”, and ‘i’ in “Identity Service”.
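
The equivalence of a range and its spelled-out form can be checked with a small sketch. It assumes Python's re module, where matching is case-sensitive by default; the all-uppercase entry in the list is a hypothetical name added to show a non-match.

```python
import re

# Service names from the example above, plus one invented uppercase name.
services = ["Azure Information Protection", "Exchange Online",
            "Identity Service", "SKYPE"]

# Names that contain at least one lowercase character from a through i.
matched = [s for s in services if re.search(r"[a-i]", s)]

# The range and the spelled-out class find exactly the same characters.
same = all(re.findall(r"[a-i]", s) == re.findall(r"[abcdefghi]", s)
           for s in services)

print(matched, same)
```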

Wildcard

The dot wildcard, ., matches any single character except line break characters.

For example, the expression gr.y matches gray, grey, gr%y, and so on.
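
A short sketch of the wildcard, again assuming Python's re module as a stand-in for the actual engine. The last two candidates are added here to show that the dot requires exactly one character and, by default, does not match a line break.

```python
import re

words = ["gray", "grey", "gr%y", "gry", "gr\ny"]

# The dot must consume exactly one character that is not a line break.
matched = [w for w in words if re.fullmatch(r"gr.y", w)]

print(matched)
```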

Word Boundary

  • \b: Matches a zero-width word boundary, such as between a letter and a space. For example: er\b matches the er in never but not the er in verb.

  • \B: Matches a word non-boundary. For example: er\B matches the er in verb but not the er in never.
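
The two word-boundary examples can be verified with the sketch below. It assumes Python's re module, where \b and \B must be written in raw strings so that the backslash reaches the regular expression engine unchanged.

```python
import re

# \b requires a word boundary right after er.
boundary_never = re.search(r"er\b", "never")   # match: er ends the word
boundary_verb = re.search(r"er\b", "verb")     # None: er is followed by b

# \B requires that there is no word boundary right after er.
inside_verb = re.search(r"er\B", "verb")       # match: er is inside the word
inside_never = re.search(r"er\B", "never")     # None: er ends the word

print(boundary_never.start(), boundary_verb, inside_verb.start(), inside_never)
```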