A.1 Basic Search Query

A basic query is a search for a value on a field. The syntax is as follows:

msg:<value>

The field name (msg) is separated from the value by a colon.

For example, to search for messages that include the word “authentication,” you can specify the search query as follows:

msg:authentication

Or, to search for events of severity 5, you can specify the search query as follows:

sev:5

If the value has spaces or other delimiters in it, you should use quotation marks. For example:

msg:"value with spaces"
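As an illustration only, the following Python sketch assembles a basic clause from a field name and a value, and adds quotation marks when the value contains whitespace. The helper name build_clause is hypothetical and is not part of Sentinel. It checks only for whitespace and does not escape special characters.

```python
def build_clause(field: str, value: str) -> str:
    """Return a 'field:value' clause, quoting values that contain whitespace."""
    # Hypothetical helper for illustration; not a Sentinel API.
    if any(ch.isspace() for ch in value):
        value = f'"{value}"'
    return f"{field}:{value}"


print(build_clause("msg", "authentication"))     # msg:authentication
print(build_clause("sev", "5"))                  # sev:5
print(build_clause("msg", "value with spaces"))  # msg:"value with spaces"
```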

Sentinel classifies event fields as either tokenized fields or non-tokenized fields. A tokenized field is indexed and is searched differently than a non-tokenized field.

A.1.1 Case Insensitivity

Indexing and searching in Sentinel is not case-sensitive. For example, the following queries are all equivalent:

msg:AdMin
msg:admin
msg:ADMIN

A.1.2 Special Characters

If you include special characters as part of a search, the special characters must be escaped. These characters are as follows:

+ - && || ! ( ) { } [ ] ^ " ~ * ? : \ /

Use “\” before the character you want to escape. For example, to search for ISO/IEC_27002:2005 in the rv145 (Tag) field, use the following query:

rv145:ISO\/IEC_27002\:2005

You can also use quotation marks around the query:

rv145:"ISO/IEC_27002:2005"

If the value itself contains quotation marks, you must escape each embedded quotation mark with the “\” character. For example, to search for “system “mail” service” in the initiatorservicename field, you must specify the query as follows:

sp:"system \"mail\" service"

For more information on quoting wildcard characters, see Quoted Wildcards.
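The two approaches above (escaping each special character, or quoting the whole value) can be sketched in Python as follows. The function names escape_value and quote_value are hypothetical. The sketch assumes that && and || can be handled by escaping each & and | individually, and that a backslash inside a quoted value must itself be escaped. It does not address wildcard characters.

```python
# Characters listed above; "&&" and "||" are covered by escaping each
# "&" and "|" individually (an assumption made for simplicity).
SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:\\/')


def escape_value(value: str) -> str:
    """Prefix every special character in value with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in value)


def quote_value(value: str) -> str:
    """Wrap value in quotation marks, escaping embedded backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


print("rv145:" + escape_value("ISO/IEC_27002:2005"))  # rv145:ISO\/IEC_27002\:2005
print("rv145:" + quote_value("ISO/IEC_27002:2005"))   # rv145:"ISO/IEC_27002:2005"
print("sp:" + quote_value('system "mail" service'))   # sp:"system \"mail\" service"
```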

A.1.3 Operators

Lucene supports the AND, OR, and NOT Boolean operators, which allow clauses to be combined. Boolean operators must always be capitalized.

OR Operator

The OR operator is the default conjunction operator. If there is no Boolean operator between two clauses, the OR operator is used. The OR operator links two clauses and finds a matching event if either of the clauses is satisfied. The symbol || can be used in place of the word OR. For example, consider the following query:

sun:admin OR dun:admin

This query finds events whose initiator username or target username is “admin.” The following query produces the same result because OR is used by default:

sun:admin dun:admin

AND Operator

The AND operator links two clauses and finds a matching event only if both clauses are satisfied. The symbol && can be used in place of the word AND. For example, consider the following query:

sun:admin AND dun:tester

This query finds events whose initiator username is “admin” and whose target username is “tester.”

NOT Operator

The NOT operator excludes events that match the clause after the NOT. The symbol ! can be used in place of the word NOT. For example, consider the following query:

sev:[0 TO 5] NOT st:I NOT st:A NOT st:P

This query matches all events whose severity is between 0 and 5, but excludes those whose sensor type is I (internal), A (audit), or P (performance); that is, it excludes Sentinel internal events.

The NOT operator cannot be used by itself because it is a way to exclude events from a set that has been found by other search terms. For example, consider the following query:

NOT st:I NOT st:A NOT st:P

This query might seem like it should return all events where the sensor type is not I, A, or P. However, it is an invalid query because a query cannot begin with the NOT operator.
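One way to picture the three operators is as set operations on the events that each clause matches. The following Python sketch is a conceptual model only, not how Sentinel or Lucene evaluates queries internally. The sample events and the match helper are made up for illustration.

```python
# Made-up sample events; only the fields used below are present.
events = [
    {"id": 1, "sun": "admin", "dun": "tester"},
    {"id": 2, "sun": "guest", "dun": "admin"},
    {"id": 3, "sun": "guest", "dun": "tester"},
]


def match(field: str, value: str) -> set[int]:
    """Return the ids of events whose field equals value, ignoring case."""
    return {e["id"] for e in events if e.get(field, "").lower() == value.lower()}


# sun:admin OR dun:admin -> union of the two result sets
either = match("sun", "admin") | match("dun", "admin")

# sun:admin AND dun:tester -> intersection of the two result sets
both = match("sun", "admin") & match("dun", "tester")

# sun:guest NOT dun:admin -> difference; NOT needs a result set to
# subtract from, which is why it cannot be used by itself
guest_not_admin = match("sun", "guest") - match("dun", "admin")

print(sorted(either), sorted(both), sorted(guest_not_admin))  # [1, 2] [1] [3]
```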

Operator Precedence

Parentheses can be used in the usual way to change operator precedence. They can be nested to any depth, as shown in the following examples:

(sun:admin OR dun:admin) AND (sip:10.0.0.1 OR sip:10.0.0.2)

((sun:admin OR dun:admin) AND (sip:10.0.0.1 OR sip:10.0.0.2)) OR (msg:user AND evt:authentication)
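When queries are assembled programmatically, wrapping every combined group in parentheses keeps the intended precedence explicit. The following Python sketch shows one possible approach; the group helper is hypothetical and always adds an outer pair of parentheses, which is redundant at the top level but harmless.

```python
def group(operator: str, *clauses: str) -> str:
    """Join clauses with a Boolean operator and wrap the result in parentheses."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be AND or OR, in capital letters")
    return "(" + f" {operator} ".join(clauses) + ")"


users = group("OR", "sun:admin", "dun:admin")
hosts = group("OR", "sip:10.0.0.1", "sip:10.0.0.2")

# (sun:admin OR dun:admin) AND (sip:10.0.0.1 OR sip:10.0.0.2)
print(f"{users} AND {hosts}")

# Nested one level deeper, combined with a second group
nested = group(
    "OR",
    group("AND", users, hosts),
    group("AND", "msg:user", "evt:authentication"),
)
print(nested)
```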

A.1.4 The Default Search Field

Lucene uses a default search field, which is the field that is searched if no field is specified. In Sentinel, _data is the default search field. Unless you customize it, this field is a concatenation of the following event fields:

evt,msg,sun,iuid,dun,tuid,sip,sp,dip,dp,rv42,shn,rv35,rv41,dhn,rv45,obsip,sn,obsdom,obssvcname,ttd,ttn,rv36,fn,ei,rt1,rv43,rv40,isvcc

The default search field is indexed and searched as a tokenized field. The result is that you can search for words that might appear in any event field.

You can also customize the set of event fields that are concatenated in the default search field by adding the indexedlog.datafield.ids property in the configuration.properties file. For more information, see Customizing the Default Search Field in the Sentinel Administration Guide.

For example, suppose you have two non-tokenized fields in an event, sun (initiatorusername) and dun (targetusername). The sun field has the following value:

report-administrator

The dun field has the following value:

system-tester

The _data field contains the concatenation of these fields separated by a single space character:

report-administrator system-tester

Because the _data field is a tokenized field, the words “report,” “administrator,” “system,” and “tester” are indexed and searchable. The following queries would find this event:

report

_data:report

report-administrator

_data:report-administrator

report tester

In addition, the following queries find this event because they match the full values of the non-tokenized sun and dun fields:

sun:report-administrator

dun:system-tester
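The construction of the _data field in this example can be modeled with a short Python sketch. It uses only the two fields from the example rather than the full default list, and its tokenizer (lowercase, split on anything that is not an ASCII letter or digit) is a rough stand-in for the real word-break rules.

```python
import re

# Only the two fields from the example, not the full default list.
DATA_FIELDS = ["sun", "dun"]

event = {"sun": "report-administrator", "dun": "system-tester"}

# _data is the concatenation of the field values, separated by single spaces.
data = " ".join(event[name] for name in DATA_FIELDS if name in event)

# Simplified tokenizer: lowercase, then split on non-alphanumeric characters.
tokens = [t for t in re.split(r"[^0-9a-z]+", data.lower()) if t]

print(data)    # report-administrator system-tester
print(tokens)  # ['report', 'administrator', 'system', 'tester']
```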

A.1.5 Tokenized Fields

Fields that are classified as tokenized fields are parsed into individual words for indexing. Therefore, a search occurs only on words within the field value. Characters that are considered to be word delimiters are not searchable, nor are words that are considered to be stop words. Lucene removes extremely common words to save disk space and speed up searching. These words are ignored in search filters. Currently, the following stop words are removed:

a
an
and
are
as
at
be
but
by
for
if
in
into
is
it
no
not
of
on
or
such
that
the
their
then
there
these
they
this
to
was
will
with

When it performs a search, Lucene examines all of the words in a field and tries to match them against the words in the search value. For example, suppose that you specify a search for messages containing the following value:

msg:"user-authentication failed on the server"

The words that are parsed within this value are “user,” “authentication,” “failed,” and “server.” These are the only search words that would match this value. “On” and “the” are omitted because they are stop words.
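The parsing described above can be approximated in Python. This is a minimal sketch, assuming that words are runs of ASCII letters and digits; the actual word-break rules are more elaborate. The function name searchable_words is hypothetical.

```python
import re

# Stop words as listed above.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in",
    "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the",
    "their", "then", "there", "these", "they", "this", "to", "was", "will",
    "with",
}


def searchable_words(value: str) -> list[str]:
    """Split value into lowercase words and drop the stop words."""
    words = re.findall(r"[0-9a-z]+", value.lower())
    return [w for w in words if w not in STOP_WORDS]


print(searchable_words("user-authentication failed on the server"))
# ['user', 'authentication', 'failed', 'server']
```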

The value has the hyphen character (-) between some words. Hyphens are treated as word delimiters, so Lucene does not search for hyphens. Consider the following query:

msg:"user-authentication"

The results might not be exactly what you expect. The search value matches the field value, but not because it matches the hyphen. It matches because Lucene first parses the words in the search value and identifies the words “user” and “authentication.” Lucene then matches those words against values that contain “user” and “authentication” with no intervening words. This query would also match the following value, even though there is no hyphen between “user” and “authentication”:

user authentication has failed on the server

Consider the following query:

msg:"failed on server"

This query contains the stop word “on,” which is ignored. However, the stop word still affects the relative position that is expected between the remaining words when a value is evaluated for a match. The “failed on server” search matches any phrase in which the words “failed” and “server” are separated by exactly one word. It does not matter what that word is, because the separating word in the query is a stop word and is ignored. Thus, the above query would match all of the following:

failed on server

failed-on server

failed a server

failed-a-server

Proximity indicators, which are created by using the ~ character followed by a value, make this more complicated. The query dictates an expected distance between words. In the “failed on server” query, the expected distance between “failed” and “server” is one word. The proximity indicator specifies how much variance there can be from the expected distance. For example, consider the following query, where a proximity indicator of one (~1) is specified:

msg:"failed on server"~1

This query indicates that the distance between “failed” and “server” can be plus or minus one from the expected distance, which is one because of the stop word “on.” The distance can therefore be 1, 1-1 (0), or 1+1 (2), so all of the following would match:

failed on server

failed on the server

failed finance server
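The positional matching and proximity behavior described in this section can be modeled with the following Python sketch. It is a simplified model of the description above, not Lucene's actual algorithm: it assumes that words are runs of ASCII letters and digits, that stop words are dropped but still occupy a position, and that the proximity value is applied to each gap between consecutive search words. The function names are hypothetical.

```python
import re

STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in",
    "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the",
    "their", "then", "there", "these", "they", "this", "to", "was", "will",
    "with",
}


def positioned_terms(text: str) -> list[tuple[str, int]]:
    """Return (word, position) pairs; stop words are dropped but keep their slot."""
    words = re.findall(r"[0-9a-z]+", text.lower())
    return [(w, i) for i, w in enumerate(words) if w not in STOP_WORDS]


def phrase_matches(value: str, phrase: str, slop: int = 0) -> bool:
    """Check whether phrase occurs in value, letting each gap vary by +/- slop."""
    query = positioned_terms(phrase)
    terms = positioned_terms(value)
    if not query:
        return False

    def search(q_index: int, prev_pos: int) -> bool:
        if q_index == len(query):
            return True
        word, q_pos = query[q_index]
        # Expected step from the previous search word, counting stop words.
        expected = q_pos - query[q_index - 1][1]
        for w, pos in terms:
            step = pos - prev_pos
            if w == word and step >= 1 and abs(step - expected) <= slop:
                if search(q_index + 1, pos):
                    return True
        return False

    first_word = query[0][0]
    return any(search(1, pos) for w, pos in terms if w == first_word)


print(phrase_matches("failed-a-server", "failed on server"))               # True
print(phrase_matches("failed on the server", "failed on server"))          # False
print(phrase_matches("failed on the server", "failed on server", slop=1))  # True
```

In this model, a value such as “failed on the main server” does not match even with a proximity value of 1, because two non-stop positions more than the allowed variance separate the words.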

As of Lucene version 3.1, word parsing is done according to word break rules outlined in the Unicode Text Segmentation algorithm. For more information, see Unicode Text Segmentation.

For information on tokenized fields in Sentinel, click Tips in the top right corner of the Sentinel Main interface. A table is displayed that lists all the event fields and whether an event field is searchable or not.

A.1.6 Non-Tokenized Fields

Fields that are classified as non-tokenized fields are indexed as complete values rather than as individual words. Thus, a search occurs on full field values. For example, to search for events whose initiatoruserfullname (iufname) field has the value “Bob White”, you must specify the query as follows:

iufname:"Bob White"
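In contrast with tokenized fields, matching on a non-tokenized field can be pictured as a comparison of the whole field value with the whole search value. The following Python sketch models that behavior, assuming case is ignored as described in Case Insensitivity and leaving wildcards aside. The function name is hypothetical.

```python
def matches_non_tokenized(field_value: str, search_value: str) -> bool:
    """Compare the whole field value with the whole search value, ignoring case."""
    return field_value.lower() == search_value.lower()


print(matches_non_tokenized("Bob White", "bob white"))  # True
print(matches_non_tokenized("Bob White", "Bob"))        # False
```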