6.1 Data Storage Considerations

Depending on your events per second (EPS) rate, you can use either traditional storage or scalable storage to store and index your Sentinel data. The way you deploy Sentinel depends on the data storage option you choose.

Table 6-1 Comparison between Traditional Storage and Scalable Storage

Traditional Storage	Scalable Storage
By default, data is stored in file-based traditional storage and indexing is done locally on the Sentinel server. In addition to file-based data storage, you can also choose to store and index events in the Visualization Data Store to leverage data visualization capabilities. For more information, see Configuring the Visualization Data Store.	Data is stored in Hadoop-based scalable storage and indexed by a scalable, distributed indexing mechanism.
Seamlessly scales up to approximately 20000 EPS. Beyond that, you must add Sentinel servers to handle a higher EPS.	Seamlessly scales out to a very large EPS, for example, 1 million events per second.
Data collection is load-balanced across several Sentinel servers. Therefore, data is spread across different Sentinel servers and should be managed individually.	Data collection is managed by a single Sentinel server. Therefore, data management and resource management are centralized on a single Sentinel server.
Data is labeled tenant-wise but not segregated tenant-wise on disk.	Data is labeled and segregated on disk tenant-wise.
Data replication and availability must be handled either manually or by using expensive storage mechanisms such as SAN disk.	Data replication and availability are cost-effective because Hadoop runs on commodity hardware.
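
As a rough planning aid for the traditional storage column above, the following minimal sketch estimates how many Sentinel servers a given EPS rate might need. It assumes each server handles about 20000 EPS, which is only the approximate figure from Table 6-1. The function name and the simple ceiling division are illustrative assumptions, not a sizing method from the product documentation.

```python
import math

# Assumption: one traditional-storage Sentinel server handles roughly
# 20,000 EPS (the approximate figure in Table 6-1). Actual capacity
# depends on hardware and workload.
APPROX_EPS_PER_SERVER = 20_000


def estimate_traditional_servers(expected_eps, eps_per_server=APPROX_EPS_PER_SERVER):
    """Return a rough count of Sentinel servers for the expected EPS."""
    if expected_eps < 0 or eps_per_server <= 0:
        raise ValueError("EPS values must be non-negative and capacity positive")
    # At least one server is always needed, even for a very low EPS rate.
    return max(1, math.ceil(expected_eps / eps_per_server))
```

Use the System Sizing Information for real deployments; this sketch only makes the scale-up threshold concrete.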

6.1.1 Planning for Traditional Storage

File-based data storage has a three-tier structure:

Online Storage	Primary storage, formerly known as local storage.	Optimized for quick writes and fast retrieval. Stores the most recently collected event data and the most frequently searched event data.
	Secondary storage, formerly known as network storage. (optional)	Optimized to reduce space usage on less expensive storage while still supporting fast retrieval. Sentinel automatically migrates data partitions to the secondary storage.
	NOTE: Using the secondary storage is optional. Data retention policies, searches, and reports operate on event data partitions regardless of whether they reside on primary storage, secondary storage, or both.
Offline Storage	Archival storage	When a partition is closed, you can back it up to any file storage service, such as Amazon Glacier. You can temporarily re-import partitions for long-term forensic analysis whenever necessary.

You can also configure Sentinel to extract the event data and event data summaries to an external database by using data synchronization policies. For more information, see Configuring Data Synchronization in the Sentinel Administration Guide.

When you install Sentinel, you must mount the disk partition for primary storage in the location where Sentinel will be installed, by default the /var/opt/novell directory.

The entire directory structure under the /var/opt/novell/sentinel directory must reside on a single disk partition to ensure correct disk usage calculations. Otherwise, the automatic data management capabilities might delete event data prematurely. For more information about the Sentinel directory structure, see Sentinel Directory Structure.
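
One way to check the single-partition requirement is to compare device IDs of the directories under the data root. The following is a minimal sketch, not a Sentinel tool: the function name is hypothetical, and comparing st_dev values is a heuristic (for example, bind mounts of the same file system share a device ID).

```python
import os


def directories_on_one_partition(root):
    """Return True if root and every directory beneath it share one device ID."""
    root_dev = os.stat(root).st_dev
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            # os.stat follows symbolic links, so a linked directory that
            # lives on another partition is also reported.
            if os.stat(path).st_dev != root_dev:
                return False
    return True
```

For example, calling directories_on_one_partition("/var/opt/novell/sentinel") before installation would flag a layout in which a subdirectory is mounted from another partition.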

As a best practice, place this data directory on a different disk partition from the executables, configuration, and operating system files. Storing variable data separately makes it easier to back up sets of files, simplifies recovery in case of corruption, and provides additional robustness if a disk partition fills up. It also improves overall performance on systems where smaller file systems are more efficient. For more information, see Disk Partitioning.

NOTE: The ext3 file system has a limitation that prevents a directory from having more than 32000 files or subdirectories. Use the XFS file system instead if you plan to have a large number of retention policies or to retain data for long periods of time, such as a year.
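
To see how close a directory is to the limit described in the note, you can count its entries. The following is a minimal sketch. The function name is hypothetical, and the 32000 figure is taken from the note above rather than queried from the file system.

```python
import os

# Assumption: the per-directory limit quoted in the note above.
EXT3_DIRECTORY_ENTRY_LIMIT = 32000


def remaining_directory_entries(directory, limit=EXT3_DIRECTORY_ENTRY_LIMIT):
    """Return how many more entries the directory can hold before the limit."""
    with os.scandir(directory) as entries:
        used = sum(1 for _ in entries)
    return limit - used
```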

Using Partitions in Traditional Installations

On traditional installations, you can modify the disk partition layout of the operating system before installing Sentinel. Create and mount the desired partitions on the appropriate directories, based on the directory structure described in Sentinel Directory Structure. When you run the installer, Sentinel is installed into the pre-created directories, resulting in an installation that spans multiple partitions.

NOTE:

You can use the --location option while running the installer to specify a top-level location other than the default directories for storing the files. The value that you pass to the --location option is prepended to the directory paths. For example, if you specify --location=/foo, the data directory will be /foo/var/opt/novell/sentinel/data and the config directory will be /foo/etc/opt/novell/sentinel/config.
You must not use filesystem links (for example, soft links) for the --location option.
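
The path arithmetic in the note above can be sketched as follows. The helper names are hypothetical and this is not the installer's own code. It only mirrors the documented behavior: the --location value is prepended to each default path, and the location itself should not involve filesystem links.

```python
import os
import posixpath

# Default directories taken from the example in the note above.
DEFAULT_DATA_DIR = "/var/opt/novell/sentinel/data"
DEFAULT_CONFIG_DIR = "/etc/opt/novell/sentinel/config"


def apply_location(location, default_path):
    """Prepend the --location value to a default Sentinel directory path."""
    return posixpath.join(location, default_path.lstrip("/"))


def contains_link(path):
    """Return True if any component of path resolves through a filesystem link."""
    return os.path.realpath(path) != os.path.abspath(path)
```

With location set to /foo, apply_location returns /foo/var/opt/novell/sentinel/data for the data directory, matching the example in the note. A location for which contains_link returns True should be avoided.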

Using Partitions in Appliance Installations

If you are using the DVD ISO appliance format, you can configure the partitioning of the appliance filesystem during installation by following the instructions in the YaST screens. For example, you can create a separate partition for the /var/opt/novell/sentinel mount point to place all data on a separate partition. For other appliance formats, you can configure the partitioning only after installation. You can add partitions and move a directory to the new partition by using the SUSE YaST system configuration tool. For information about creating partitions after the installation, see Creating Partitions for Traditional Storage.

Best Practices for Partition Layout

Many organizations have their own documented best-practice partition layout schemes for any installed system. The following partition proposal is intended to guide organizations that have no defined policy, and it takes into account Sentinel-specific use of the filesystem. Generally, Sentinel adheres to the Filesystem Hierarchy Standard where practicable.

Partition	Mount point	Size	Notes
Root	/	100GB	Contains operating system files and Sentinel binaries/configuration.
Boot	/boot	150MB	Boot partition
Primary storage	/var/opt/novell/sentinel	Calculate using the System Sizing Information.	This area will contain the primary Sentinel collected data, and other variable data such as log files. This partition can be shared with other systems.
Secondary storage	Location based on the type of storage, NFS, CIFS, or SAN.	Calculate using the System Sizing Information.	This is the secondary storage area, which can be mounted locally as shown or remotely.
Archival storage	Remote system	Calculate using the System Sizing Information.	This storage is for archived data.

Configuring the Visualization Data Store

Sentinel provides event visualizations that present data in charts, tables, and maps. These visualizations make it easier to explore and analyze large volumes of events. You can also create your own visualizations and dashboards.

Sentinel leverages Kibana, a browser-based analytics and search dashboard that helps you search and visualize events. Kibana accesses data from the visualization data store (Elasticsearch) to present events in dashboards. By default, Sentinel includes an Elasticsearch node that stores and indexes only alerts. You must enable event visualization to store and index events in Elasticsearch.

When you enable Elasticsearch to store and index data, Sentinel indexes only the specific event fields required for visualizations and stores the indexed fields in Elasticsearch. Sentinel creates a dedicated index for each day and uses the UTC time zone (midnight to midnight) to calculate the index date. The index name is in the security.events.normalized_yyyyMMdd format. For example, the index security.events.normalized_20160101 contains all events with an event time of January 01, 2016.
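
The naming rule above can be illustrated with a minimal sketch that derives the index name from an event time. The function name is hypothetical. The prefix and the UTC day boundary come from the description above.

```python
from datetime import datetime, timezone

INDEX_PREFIX = "security.events.normalized_"


def index_name_for(event_time):
    """Return the daily index name for a timezone-aware event time."""
    if event_time.tzinfo is None:
        raise ValueError("event_time must be timezone-aware")
    # The index date is calculated in UTC, regardless of the local time zone.
    utc_time = event_time.astimezone(timezone.utc)
    return INDEX_PREFIX + utc_time.strftime("%Y%m%d")
```

Because the day boundary is in UTC, an event recorded late on December 31, 2015 in a time zone behind UTC can land in the security.events.normalized_20160101 index.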

Configuring the visualization data store involves the following:

Installing Elasticsearch nodes in a cluster mode: By default, Sentinel includes an Elasticsearch node. For optimal performance and stability of the Sentinel server, it is mandatory that you install additional Elasticsearch nodes in a cluster mode. For more information, see Section 12.0, Installing and Configuring Elasticsearch.
Enabling event visualization: Event visualization is disabled by default. To enable event visualization, see Section 20.0, Enabling Event Visualization.
Performance tuning: Sentinel automatically configures certain Elasticsearch settings for optimal performance. You can customize these settings as needed. For example, you can modify the event fields you want Elasticsearch to index. For more information, see Performance Tuning for Elasticsearch.

6.1.2 Planning for Scalable Storage

Sentinel uses Cloudera’s Distribution Including Apache Hadoop (CDH) framework to store and manage large volumes of data. For indexing events, Sentinel uses Elasticsearch, a scalable, distributed indexing engine from Elastic.

The following illustration explains the various components used in scalable storage:

Figure 6-1 Scalable Storage Architecture

Messaging: Sentinel uses Apache Kafka as the scalable messaging system. Collector Managers send normalized event data and raw data to Kafka clusters.

By default, Sentinel creates the following Kafka topics:
- security.events.normalized: Stores all the processed and normalized event data including system generated events and internal events.
- security.events.raw: Stores all the raw data from the event sources.
Event and raw data follow the Apache Avro schema. For more information, see the Apache Avro documentation. The schema files are available in the /etc/opt/novell/sentinel/scalablestore directory.
Worker: This node hosts real-time processing and storage jobs. Apache Spark performs large-scale data processing in real time, such as segregating events based on tenant IDs, requesting large volumes of data and storing the data in the system of record (SOR), and scalable indexing.

Apache HBase is a distributed and scalable Hadoop-based data store. It is used as an SOR for normalized events and raw data, segregated by tenant IDs.

Based on the tenant ID, Sentinel creates a separate namespace for each tenant. For example, the namespace for the default tenant is 1. Under each namespace, Sentinel creates the following tables and stores data based on the event time.
- <tenant_ID>:security.events.normalized: Stores all the processed and normalized event data including system generated events and internal events.
- <tenant_ID>:security.events.raw: Stores all the raw data from the event sources.
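
The per-tenant naming convention above can be sketched as a small helper that builds the expected table names from a tenant ID. The function name is hypothetical and the sketch does not connect to HBase. It only reproduces the <tenant_ID>:<table> pattern described above.

```python
from typing import List

# Table names created under each tenant namespace, as listed above.
TENANT_TABLES = ("security.events.normalized", "security.events.raw")


def hbase_tables_for_tenant(tenant_id: int) -> List[str]:
    """Return the namespace-qualified HBase table names for a tenant."""
    namespace = str(tenant_id)
    return [f"{namespace}:{table}" for table in TENANT_TABLES]
```

For the default tenant, whose namespace is 1, this yields 1:security.events.normalized and 1:security.events.raw.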
Cluster Management: This node hosts all the masters and cluster management services. Apache ZooKeeper acts as a centralized service for maintaining configuration information, naming services, providing distributed synchronization, and providing group services.
Indexing: Sentinel uses Elasticsearch as the scalable and distributed indexing engine for indexing events. You can access data from Elasticsearch for searching and visualizing events.

Sentinel creates a dedicated index for each day and uses the UTC time zone (midnight to midnight) to calculate the index date. The index name is in the security.events.normalized_yyyyMMdd format. For example, the index security.events.normalized_20160101 contains all events with an event time of January 01, 2016. For optimal performance, Sentinel indexes only some specific event fields. You can modify the event fields you want Elasticsearch to index. For more information, see Performance Tuning for Elasticsearch.
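
Because each index covers one UTC day, a search over a time window touches one index per day in that window. The following minimal sketch lists the index names that a window spans. The function name is hypothetical and the sketch does not query Elasticsearch.

```python
from datetime import datetime, timedelta, timezone


def index_names_between(start, end):
    """Return the daily index names covering a timezone-aware time window."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    # Index dates are calculated in UTC, midnight to midnight.
    first_day = start.astimezone(timezone.utc).date()
    last_day = end.astimezone(timezone.utc).date()
    if last_day < first_day:
        raise ValueError("end must not be earlier than start")
    names = []
    day = first_day
    while day <= last_day:
        names.append("security.events.normalized_" + day.strftime("%Y%m%d"))
        day += timedelta(days=1)
    return names
```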

Scalable Storage Configuration

When you enable scalable storage, the Sentinel server user interface is trimmed down to cater to only some Sentinel features, such as data collection, correlation, event routing, event search and visualization, and certain administrative activities. This trimmed-down version of Sentinel is referred to as Sentinel Scalable Data Manager (SSDM). For other Sentinel capabilities such as Security Intelligence, conventional searching, and reporting, you must install separate instances of Sentinel with traditional storage and route the specific event data from SSDM to Sentinel by using Sentinel Link.

The following services and features are not available in SSDM:

Reports
Security Intelligence
Performing event operations during search
Testing correlation rules
Incident creation and management
Manually performing Actions on events
Data Synchronization
iTRAC Workflows
Forensic analysis on the events that trigger the correlated event
Viewing event attachments for Secure Configuration Manager and Change Guardian events

Enabling scalable storage is a one-time configuration, which cannot be reverted. If you want to disable scalable storage and switch to traditional storage, you must re-install Sentinel.

The following checklist provides high-level information about the tasks you need to perform to configure scalable storage:

Table 6-2 Scalable Storage Configuration Checklist

	Tasks	See
	Review the deployment information to understand how you need to deploy Sentinel with scalable storage.	Three-Tier Deployment with Scalable Storage
	Review the prerequisites and complete all the required tasks.	Section 13.0, Installing and Setting Up Scalable Storage.
	Enable scalable storage. You can enable scalable storage either during installation or post-installation. In upgrade installations, you can enable scalable storage only after you upgrade Sentinel.	To enable scalable storage during installation, perform a custom installation of Sentinel. See Sentinel Server Custom Installation. To enable scalable storage post-installation or post-upgrade, see Enabling Scalable Storage Post-Installation in the Sentinel Administration Guide.
	Configure CDH components and Elasticsearch with Sentinel.	Configuring Scalable Storage in the Sentinel Administration Guide.

6.1.3 Sentinel Directory Structure

By default, the Sentinel directories are in the following locations:

Data files are in the /var/opt/novell/sentinel/data and /var/opt/novell/sentinel/3rdparty directories.
Executables and libraries are stored in the /opt/novell/sentinel directory.
Log files are in the /var/opt/novell/sentinel/log directory.
Temporary files are in the /var/opt/novell/sentinel/tmp directory.
Configuration files are in the /etc/opt/novell/sentinel directory.
The process ID (PID) file is /home/novell/sentinel/server.pid.

Using the PID, administrators can identify the parent process of Sentinel server and monitor or terminate the process.
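
As an illustration of monitoring by PID, the following minimal sketch reads the PID file and checks whether that process still exists. It assumes the PID file contains a single integer, which is an assumption about the file format rather than something stated above. The function names are hypothetical.

```python
import os

DEFAULT_PID_FILE = "/home/novell/sentinel/server.pid"


def read_pid(pid_file=DEFAULT_PID_FILE):
    """Read the process ID from the PID file (assumed to hold one integer)."""
    with open(pid_file, encoding="ascii") as handle:
        return int(handle.read().strip())


def is_running(pid):
    """Return True if a process with this PID exists."""
    try:
        # Signal 0 performs error checking only; no signal is delivered.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

For example, is_running(read_pid()) returns False when the PID recorded for the Sentinel server no longer corresponds to a live process.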