13.1 Raw Data Storage

Sentinel compresses the raw data and stores it in protected partitions that are based on the time and the event source. New raw data files are created every hour. The data is moved from the primary, compressed, file-based storage to a user-configured, compressed secondary storage location on a regular basis.

Sentinel stores the raw data files in one of the following locations:

  • Primary storage location: <Sentinel data directory>/rawdata/online

  • Secondary storage location: <Sentinel archive directory>/rawdata_archive

The compressed raw data files are moved from the primary storage to the secondary storage location.

The following table describes the directory structure of the raw data in the primary storage under the installation directory:

Table 13-1 Raw Data Directory Structure

Directory Structure

Description

/data

The primary directory for all data storage.

/data/rawdata

The subdirectory where all raw data is stored.

/data/rawdata/online

The directory where all the raw data in the primary storage is stored.

/data/rawdata/online/EventSource UUID

There is one subdirectory for each event source under the online subdirectory. That subdirectory contains all raw data received from that event source.

The subdirectory name is the universally unique identifier (UUID) of the event source (for example, E20D0840-1E0A-102C-9F30-000C2949BA91).

/data/rawdata/online/EventSource UUID/Month

Data in the event source subdirectory is partitioned by month. Each month has its own subdirectory.

The subdirectory name is in the yyyy-mm format. For example, 2009-05 indicates May 2009.

/data/rawdata/online/EventSource UUID/Month/1 Hour Data Files

Each file in the Month directory contains data received during a specific one-hour period. Most data in the file has a time stamp that is within the one-hour period.

The name of the file indicates the day of the month and the one-hour period that is represented.

The filename format is dd-hhmm.extension.

dd is the day of the month.

hh is the hour of the day.

mm is the minute of the hour.

The extension is either .gz or .open.

NOTE: Raw data files are compressed and have the extension .gz. However, while a raw data file is still being written to, it appears with the extension .open.

For example:

A filename of 08-0000.gz indicates that the file contains compressed data received on the 8th day of the month between 12:00 a.m. and 1:00 a.m.

A filename of 08-1300.open indicates that the file contains data received on the 8th day of the month between 1:00 p.m. and 2:00 p.m.
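The filename convention above can be decoded mechanically. The following is a minimal Python sketch, not part of Sentinel; the names RawFileName and parse_raw_file_name are hypothetical helpers introduced only for illustration.

```python
import re
from typing import NamedTuple

# Matches names such as "08-0000.gz" or "08-1300.open" (dd-hhmm.extension).
_RAW_FILE_RE = re.compile(r"^(\d{2})-(\d{2})(\d{2})\.(gz|open)$")


class RawFileName(NamedTuple):
    """Hypothetical container for the parts of a raw data filename."""
    day: int       # dd: day of the month
    hour: int      # hh: hour of the day (24-hour clock, as in 08-1300)
    minute: int    # mm: minute of the hour
    is_open: bool  # True if the file has the .open extension


def parse_raw_file_name(name: str) -> RawFileName:
    """Split a raw data filename into its day, hour, minute, and state."""
    match = _RAW_FILE_RE.match(name)
    if match is None:
        raise ValueError(f"not a raw data filename: {name!r}")
    day, hour, minute, extension = match.groups()
    return RawFileName(int(day), int(hour), int(minute), extension == "open")


print(parse_raw_file_name("08-1300.open"))
# RawFileName(day=8, hour=13, minute=0, is_open=True)
```

The month and year are not part of the filename; they come from the name of the parent Month directory.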

If the raw data files are stored in the primary storage location, the full path name of the file is in the following format:

<Sentinel data directory>/rawdata/online/<event source UUID>/<Date>/<RawDataFile>

For example:

/var/opt/novell/sentinel/data/rawdata/online/A75CF6A0-4948-102D-A615-000C29A9C3DB/2010-05/24-0600.gz

In this example, /var/opt/novell/sentinel/data is the data directory for Sentinel.

If the raw data files are stored in the secondary storage location, the full path name would be as follows:

<Sentinel archive directory>/rawdata_archive/<event source UUID>/<Date>/<RawDataFile>

For example:

/sentinel_archive_data/rawdata_archive/A75CF6A0-4948-102D-A615-000C29A9C3DB/2010-05/24-0600.gz

In this example, /sentinel_archive_data is the secondary storage directory configured by the user.
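To locate the file for a given hour, the path formats above can be assembled from their parts. The following Python sketch assumes that the hhmm portion of the filename marks the start of the one-hour period, as the earlier filename examples suggest, and that the caller supplies the time in whatever time zone the Sentinel server uses for file naming (this section does not specify it). The function raw_data_path is a hypothetical helper, not a Sentinel API.

```python
from datetime import datetime
from pathlib import PurePosixPath


def raw_data_path(base_dir: str, event_source_uuid: str, hour_start: datetime,
                  archived: bool = False, extension: str = "gz") -> PurePosixPath:
    """Build the expected path of the raw data file that starts at hour_start."""
    root = PurePosixPath(base_dir)
    # Primary storage uses rawdata/online; secondary storage uses rawdata_archive.
    root = root / "rawdata_archive" if archived else root / "rawdata" / "online"
    month_dir = hour_start.strftime("%Y-%m")                       # yyyy-mm
    file_name = hour_start.strftime("%d-%H%M") + "." + extension   # dd-hhmm.extension
    return root / event_source_uuid / month_dir / file_name


print(raw_data_path("/var/opt/novell/sentinel/data",
                    "A75CF6A0-4948-102D-A615-000C29A9C3DB",
                    datetime(2010, 5, 24, 6, 0)))
# /var/opt/novell/sentinel/data/rawdata/online/A75CF6A0-4948-102D-A615-000C29A9C3DB/2010-05/24-0600.gz
```

Passing archived=True with the secondary storage directory as base_dir produces the corresponding path under rawdata_archive.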

13.1.1 Raw Data Representation

Each raw data event is represented as a single line in a raw data file. Each line is a JSON object with the following format:

{ 
   "EventDate": "<date>", 
   "EventRecordID": "<event record uuid>", 
   "RawData": "<raw data>", 
   "RawDataHash": "<SHA256 hash of raw data, in hex format>", 
   "EventSourceManagerID": "<uuid of event source manager>", 
   "CollectorID": "<uuid of collector>", 
   "EventSourceGroupID": "<uuid of event source group>", 
   "EventSourceID": "<uuid of event source>", 
   "ChainID": "<chain ID>", 
   "ChainSequence": "<Sequence number>" 
}

The following table describes each of the fields in the raw data event:

Table 13-2 Raw Data Representation

Field Name

Description

EventDate

The date and time when Sentinel received this event, not the date and time when the event occurred.

Example: "05/24/2010 06:15:06.676"

EventRecordID

The unique ID identifying the raw data record.

Example: "595829C0-1C8F-102C-A922-000C2949BA91"

If an event was generated as a result of parsing a raw data record, this ID is set in the event RecordID field. Because of filtering, not all raw data records result in an event.

RawData

The original raw data received by the event source.

RawDataHash

The SHA-256 hash of the RawData value represented as a HEX string. The hash is calculated by converting the RawData value to a UTF-8 string and then performing the hash over that string.

To detect tampering, each raw data event is stored with a SHA-256 hash value.

Example: cc661009e2f3dc565c0c7fe25b705219004dcd8132c0b0a7e987bfdcb55e49cf

EventSourceID

The UUID of the event source from which the raw data originated.

Example: A2A0C600-1C6C-102C-A781-000C2949BA91

EventSourceGroupID

The UUID of the event source group (Connector) to which the event source was connected when the raw data was received.

Example: A2A0C600-1C6C-102C-A77A-000C2949BA91

Different raw events from the same event source can have different event source group IDs, because event sources can be moved from one Connector to another.

CollectorID

The UUID of the Collector that the Connector and event source were connected to when the raw data was received.

Different raw events from the same event source can have different Collector IDs, because event sources and event source groups can be moved from one Collector to another.

Example: A2A0C600-1C6C-102C-A779-000C2949BA91

EventSourceManagerID

The UUID of the Event Source Manager (Collector Manager) object where this raw data was received.

Example: C76D2820-C395-1029-BB86-001321B5C0B3

ChainID

A random number that identifies a raw data chain. Whenever an event source is stopped and restarted between generation of raw data events, a new ChainID number is generated.

To detect tampering, each raw data event is stored with a ChainID and a ChainSequence number.

Example: 1241630654754

ChainSequence

A sequence number within a particular raw data chain.

The raw data events in a given raw data chain must have an uninterrupted sequence of numbers starting with 0. In addition, all raw data events in a given raw data chain must appear sequentially in the files, with no other chains intermixed. A raw data chain can span files. In that case, the sequence should continue uninterrupted into the file for each subsequent hour during which raw data was received.

Example: 4

If no raw data is received during a one-hour period, recording resumes with the next arrival of raw data. Nonetheless, the raw data chain sequence should continue uninterrupted until a new raw data chain begins. A new raw data chain is signaled by a changed ChainID value and a ChainSequence value of zero (0).
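The chain rules described above lend themselves to an automated check. The following Python sketch is one possible way to apply them and is not a Sentinel tool. It assumes the records of one event source are supplied as parsed JSON objects in file order, beginning at the start of a chain. The function find_chain_breaks is a hypothetical name.

```python
from typing import Iterable, List


def find_chain_breaks(records: Iterable[dict]) -> List[str]:
    """Describe every place where the ChainID/ChainSequence rules are violated."""
    problems = []
    finished_chains = set()   # chains that were followed by a different chain
    current_chain = None
    expected = 0
    for index, record in enumerate(records):
        chain_id = record["ChainID"]
        sequence = int(record["ChainSequence"])
        if chain_id != current_chain:
            if current_chain is not None:
                finished_chains.add(current_chain)
            # A chain that resumes after another chain means chains are intermixed.
            if chain_id in finished_chains:
                problems.append(
                    f"record {index}: chain {chain_id} reappears after another chain")
            current_chain = chain_id
            expected = 0      # every new chain must start at 0
        if sequence != expected:
            problems.append(
                f"record {index}: expected ChainSequence {expected}, found {sequence}")
        expected = sequence + 1
    return problems
```

An empty result means that no gap, restart, or intermixed chain was found in the supplied records. When only part of a chain is checked (for example, a single file from the middle of a chain), the first record is reported because its sequence number is not 0.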

The following examples show three raw data records:

{ 
   "EventDate":"05\/24\/2010 06:15:06.676", 
   "EventRecordID":"A75CF6A0-4948-102D-A61C-000C29A9C3DB", 
   "RawData":"Sep 22 10:22:00 testhost Message #100", 
   "RawDataHash":"7003c0e0be4ddf43a3b49026a37483f59c7f839950f581ec9fde5dea43da90f5", 
   "EventSourceManagerID":"C76D2820-C395-1029-BB86-001321B5C0B3", 
   "CollectorID":"A75CF6A0-4948-102D-A613-000C29A9C3DB", 
   "EventSourceGroupID":"A75CF6A0-4948-102D-A614-000C29A9C3DB", 
   "EventSourceID":"A75CF6A0-4948-102D-A615-000C29A9C3DB", 
   "ChainID":"1274696106664", 
   "ChainSequence":"0" 
} 
{ 
   "EventDate":"05\/24\/2010 06:15:07.358", 
   "EventRecordID":"A75CF6A0-4948-102D-A624-000C29A9C3DB", 
   "RawData":"Sep 22 10:22:00 testhost Message #99", 
   "RawDataHash":"f5681ba965144d2d22b13188767d94540b5fe57904afcee5821854bde2afca72", 
   "EventSourceManagerID":"C76D2820-C395-1029-BB86-001321B5C0B3", 
   "CollectorID":"A75CF6A0-4948-102D-A613-000C29A9C3DB", 
   "EventSourceGroupID":"A75CF6A0-4948-102D-A614-000C29A9C3DB", 
   "EventSourceID":"A75CF6A0-4948-102D-A615-000C29A9C3DB", 
   "ChainID":"1274696106664", 
   "ChainSequence":"1" 
} 
{ 
   "EventDate":"05\/24\/2010 06:15:07.988", 
   "EventRecordID":"A75CF6A0-4948-102D-A62A-000C29A9C3DB", 
   "RawData":"Sep 22 10:22:00 testhost Message #98", 
   "RawDataHash":"98435b5dba95633699b88d07782109876e8ceb4169d567602f2c92657118645d", 
   "EventSourceManagerID":"C76D2820-C395-1029-BB86-001321B5C0B3", 
   "CollectorID":"A75CF6A0-4948-102D-A613-000C29A9C3DB", 
   "EventSourceGroupID":"A75CF6A0-4948-102D-A614-000C29A9C3DB", 
   "EventSourceID":"A75CF6A0-4948-102D-A615-000C29A9C3DB", 
   "ChainID":"1274696106664", 
   "ChainSequence":"2" 
} 
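Records in this format can be read and checked with standard tools. The following Python sketch reads a compressed raw data file line by line, recomputes the RawDataHash value, and parses the EventDate value. It rests on several assumptions that this section does not confirm: that the file content is UTF-8 text, that each record occupies exactly one line in the file (the examples above are wrapped for readability), and that the hash is computed over the JSON-decoded RawData value. All three function names are hypothetical.

```python
import gzip
import hashlib
import json
from datetime import datetime
from typing import Iterator


def read_raw_records(path: str) -> Iterator[dict]:
    """Yield one dict per non-empty line of a compressed (.gz) raw data file."""
    # Assumption: the decompressed content is UTF-8 text.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield json.loads(line)


def hash_matches(record: dict) -> bool:
    """Recompute SHA-256 over the UTF-8 bytes of RawData and compare in hex."""
    digest = hashlib.sha256(record["RawData"].encode("utf-8")).hexdigest()
    return digest == record["RawDataHash"].lower()


def received_at(record: dict) -> datetime:
    """Parse EventDate, for example "05/24/2010 06:15:06.676"."""
    # json.loads has already turned the escaped "\/" into a plain "/".
    return datetime.strptime(record["EventDate"], "%m/%d/%Y %H:%M:%S.%f")
```

A record for which hash_matches returns False has a RawData value that no longer corresponds to its stored hash, under the assumptions listed above.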

13.1.2 Disabling Raw Data Collection

By default, raw data collection is enabled for the Collector Manager on the Sentinel server. Collecting raw data can affect the performance of the server or the remote Collector Manager. Perform the following procedure on any Collector Manager where you want to disable raw data collection:

  1. Open the /etc/opt/novell/sentinel/config/event-router.properties file in a text editor.

    This is the default location of the file.

  2. Change esecurity.router.event.rawdata.send=true to esecurity.router.event.rawdata.send=false.

  3. Save the file, then restart the Collector Manager.
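If the change must be made on several Collector Managers, step 2 can be scripted. The following Python sketch is an illustration only, not a Sentinel utility. The function disable_raw_data is a hypothetical helper that rewrites the property in place and leaves all other lines as they are. Back up the file before running anything like it. The script does not restart the Collector Manager, so step 3 still applies.

```python
from pathlib import Path

PROPERTY = "esecurity.router.event.rawdata.send"


def disable_raw_data(config_path: str) -> bool:
    """Set the raw data property to false; return True if the file was changed."""
    path = Path(config_path)
    # latin-1 maps every byte to a character, so other lines survive unchanged.
    lines = path.read_text(encoding="latin-1").splitlines(keepends=True)
    changed = False
    for i, line in enumerate(lines):
        key, separator, value = line.partition("=")
        # Commented-out lines start with "#", so their key never matches.
        if separator and key.strip() == PROPERTY and value.strip() != "false":
            newline = "\n" if line.endswith("\n") else ""
            lines[i] = f"{PROPERTY}=false{newline}"
            changed = True
    if changed:
        path.write_text("".join(lines), encoding="latin-1")
    return changed
```

For example, calling disable_raw_data("/etc/opt/novell/sentinel/config/event-router.properties") targets the default location named in step 1. A return value of False means that the property was already false or was not present in the file.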