3.3 Deploying Analytics Server for High Availability

This section provides information about how to install Analytics Server in an Active-Passive High Availability mode, which allows the server to fail over to a redundant cluster node in case of hardware or software failure. For more information on implementing high availability and disaster recovery in the Analytics Server environment, contact NetIQ Technical Support.

High availability refers to a design methodology that is intended to keep a system available for use as much as is practicable. The intent is to minimize the causes of downtime, such as system failures and maintenance, and to minimize the time it takes to detect and recover from downtime events that do occur. In practice, as higher levels of availability are required, automated means of detecting and recovering from downtime events become necessary.

For more information about high availability, see the SUSE High Availability Guide.

3.3.1 Prerequisites

When allocating cluster resources to support a high availability (HA) installation, consider the following requirements:

  • Ensure that each cluster node that hosts the Analytics Server services meets the requirements specified in Section 3.1.1, System Requirements.

  • Ensure that sufficient shared storage is available for the Analytics Server data and application.

  • Ensure that you use a virtual IP address for the services that can be migrated from node to node on failover.

  • Ensure that your shared storage device meets the performance and size requirements. NetIQ recommends using a standard SUSE Linux VM configured with iSCSI Targets as shared storage.

  • Ensure that you have only two cluster nodes that meet the resource requirements for running Analytics Server in the cluster environment.

  • Ensure you have a virtual IP that can be migrated from one node to another node in a cluster to serve as the external IP address for Analytics Server.

  • Ensure you create a method for the cluster nodes to communicate with the shared storage, such as FibreChannel for a SAN. NetIQ recommends a dedicated IP address to connect to the iSCSI Target.

  • Ensure you have at least one IP address per cluster node for internal cluster communications. NetIQ recommends a simple unicast IP address, but multicast is preferred for production environments.

3.3.2 Installing Analytics Server in a High Availability Environment

To install Analytics Server in High Availability mode, perform the following:

Initial Setup

Configure the computer hardware, network hardware, storage hardware, operating systems, user accounts, and other basic system resources per the documented requirements for Analytics Server and local customer requirements. Test the systems to ensure proper function and stability.

Use the following checklist to guide you through initial setup and configuration:

  • The CPU, RAM, and disk space characteristics for each cluster node must meet the system requirements defined in Section 3.1.1, System Requirements based on the expected event rate.

  • If you want to configure the operating system firewalls to restrict access to Analytics Server and the cluster, refer to Table 1-2 for details of which ports must be available, depending on your local configuration and the sources that will be sending event data.

  • Ensure that all cluster nodes are time-synchronized. You can use NTP or a similar technology for this purpose (see the example after this checklist).

  • The cluster requires reliable host name resolution. Enter all internal cluster host names into the /etc/hosts file to ensure cluster continuity in case of DNS failure (see the example after this checklist).

  • Ensure that you do not assign a host name to a loopback IP address.

  • When configuring host name and domain name while installing the operating system, deselect Assign Hostname to Loopback IP.

  • Each node requires two NICs: one for external access and one for iSCSI communications.

  • Configure the external NICs with IP addresses that allow for remote access through SSH or similar. For this example, we will use 172.16.0.1 (node01) and 172.16.0.2 (node02).

  • Each node should have sufficient disk space for the operating system and Analytics Server. For system requirements, see Section 3.1.1, System Requirements.

  • One SUSE Linux 11 SP3 VM configured with iSCSI Targets for shared storage

    • (Conditional) You can install X Windows if you require GUI configuration. Set the boot scripts to start without X (runlevel 3), so that you can start X only when needed.

    • The system will have two NICs: one for external access and one for iSCSI communications.

    • Configure the external NIC with an IP address that allows for remote access using SSH or similar. For example, 172.16.0.3 (storage03).

    • The system should have sufficient space for the operating system, temp space, a large volume for shared storage to hold the Analytics Server data, and a small amount of space for an SBD partition. See the SUSE Linux system requirements and the Analytics Server event data storage requirements.

  • Perform the steps mentioned in the following sections.

NOTE: In a production cluster, you can use internal, non-routable IPs on separate NICs (possibly more than one, for redundancy) for internal cluster communications.
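
The time-synchronization and host name resolution items in the checklist above can be handled with a few settings on each node. The entries below are only an example that uses the host names and addresses from this section; adjust them for your environment. Add the cluster hosts to /etc/hosts on every node:

    172.16.0.1   node01
    172.16.0.2   node02
    172.16.0.3   storage03

Then enable time synchronization. This sketch assumes the standard SLES ntp package and its init script, with your NTP servers already configured in /etc/ntp.conf:

    # Enable and start NTP on each node so that the cluster stays time-synchronized
    chkconfig ntp on
    rcntp start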

Shared Storage Setup

Set up your shared storage and make sure that you can mount it on each cluster node. If you are using FibreChannel and a SAN, you might need to provide physical connections as well as additional configuration. Analytics Server uses this shared storage to store the databases and event data. Ensure that the shared storage is appropriately sized based on the expected event rate and data retention policies.

A typical implementation might use a fast SAN attached using FibreChannel to all the cluster nodes, with a large RAID array to store the local event data. As long as the cluster node can mount the shared storage as a normal block device, it can be used by the solution.

NOTE: NetIQ recommends that you configure your shared storage and test mounting it on each cluster node. However, the cluster configuration handles the actual mounting of the storage.

NetIQ recommends using the following procedure to create iSCSI Targets hosted by a SLES VM:

  1. Connect to storage03, the VM you created during Initial Setup, and start a console session.

  2. Use the dd command to create a blank file of any desired size for the Analytics Server shared storage. To create a 20 GB file filled with zeros copied from the /dev/zero pseudo-device file, run the following command:

    dd if=/dev/zero of=/localdata count=20480000 bs=1024
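
(Optional) As a quick sanity check, which is not part of the documented procedure, you can verify that the file was created with the expected size:

    # The file should be roughly 20 GB (20480000 blocks x 1024 bytes)
    ls -lh /localdata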

Configuring iSCSI Targets

Configure the /localdata file as an iSCSI Target:

  1. Run YaST from the command line (or use the Graphical User Interface, if preferred): /sbin/yast

  2. Select Network Devices > Network Settings.

  3. Ensure that the Overview tab is selected.

  4. Select the secondary NIC from the displayed list, then tab forward to Edit and press Enter.

  5. On the Address tab, assign a static IP address of 10.0.0.3. This will be the internal iSCSI communications IP.

  6. Click Next, then click OK.

  7. On the main screen, select Network Services > iSCSI Target.

  8. If prompted, install the required software (iscsitarget RPM) from the SUSE Linux 11 SP3 media.

  9. Click Service, select the When Booting option to ensure that the service starts when the operating system boots.

  10. Click Global, and then select No Authentication because the current OCF Resource Agent for iSCSI does not support authentication.

  11. Click Targets and then click Add to add a new target.

    The iSCSI Target will auto-generate an ID and then present an empty list of LUNs (drives) that are available.

  12. Click Add to add a new LUN.

  13. Leave the LUN number as 0, then browse in the Path dialog (under Type=fileio) and select the /localdata file that you created. If you have a dedicated disk for storage, specify a block device, such as /dev/sdc.

  14. Leave the other options at their defaults. Click OK and then click Next.

  15. Click Next again to select the default authentication options, then Finish to exit the configuration. Accept if asked to restart iSCSI.

  16. Exit YaST.

NOTE: This procedure exposes an iSCSI Target on the server at IP address 10.0.0.3. At each cluster node, ensure that it can mount the localdata shared storage device.

Configuring iSCSI Initiators

Use the following procedure to configure the iSCSI initiator on each node and format the shared device:

  1. Connect to one of the cluster nodes (node01) and start YaST.

  2. Select Network Devices > Network Settings.

  3. Ensure that the Overview tab is selected.

  4. Select the secondary NIC from the displayed list, then tab forward to Edit and press Enter.

  5. Click Address, assign a static IP address of 10.0.0.1. This will be the internal iSCSI communications IP.

  6. Select Next, then click OK.

  7. Click Network Services > iSCSI Initiator.

  8. If prompted, install the required software (open-iscsi RPM) from the SUSE Linux 11 SP3 media.

  9. Click Service, select When Booting to ensure the iSCSI service is started on boot.

  10. Click Discovered Targets, and select Discovery.

  11. Specify the iSCSI Target IP address (10.0.0.3), select No Authentication, and then click Next.

  12. Select the discovered iSCSI Target with the IP address 10.0.0.3 and then select Log In.

  13. Switch to automatic in the Startup drop-down and select No Authentication, then click Next.

  14. Switch to the Connected Targets tab to ensure that the initiator is connected to the target.

  15. Exit the configuration. This should have mounted the iSCSI Targets as block devices on the cluster node.

  16. In the YaST main menu, select System > Partitioner.

  17. In the System View, you should see a new hard disk (such as /dev/sdb) in the list; it will have the type IET-VIRTUAL-DISK. Tab over to this hard disk, select it, then press Enter.

  18. Select Add to add a new partition to the empty disk. Format the disk as a primary ext3 partition, but do not mount it. Ensure that the option Do not mount partition is selected.

  19. Select Next, then Finish after reviewing the changes that will be made. Assuming you create a single large partition on this shared iSCSI LUN, you should end up with a /dev/sdb1 or similar formatted disk (referred to as /dev/<SHARED1> below).

  20. Exit YaST.

  21. Repeat steps 1-15 to ensure that each cluster node can mount the local shared storage. Replace the node IP in step 5 with a different IP for each cluster node (for example, 10.0.0.2 for node02).
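
(Optional) Before continuing, you can confirm from the command line that the initiator session is established and that the shared LUN is visible as a block device. This is a minimal check that assumes the open-iscsi tools and the example device name /dev/sdb; your device name may differ:

    # List active iSCSI sessions; the target on 10.0.0.3 should appear
    iscsiadm -m session
    # The new ext3 partition (/dev/sdb1 or similar) should be listed
    fdisk -l /dev/sdb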

Configuring Analytics Server for High Availability on the Nodes

Install Analytics Server on each cluster node that can host it. The first time you install Analytics Server, you must perform a complete installation, including the application binaries, configuration, and all the data stores. For subsequent installations on the other cluster nodes, you install only the application; the Analytics Server data becomes available after you mount the shared storage.

First node

To configure Analytics Server for HA, perform the following steps:

  1. Connect to one of the cluster nodes (node01) and open a console window.

  2. Navigate to the following directory:

    cd /opt/novell/sentinel/setup
  3. Record the configuration:

    1. Execute the following command:

      ./configure.sh --record-unattended=/tmp/install.props --no-start

      This step records the configuration in the install.props file, which is required to configure the cluster resources by using the install-resources.sh script.

    2. Specify the Standard configuration option for Analytics Server Configuration method.

    3. Specify 2 to enter a new password.

      Even if you want to use the existing password, choose 2 so that the password is stored and synchronized with the other node. If you specify 1, the install.props file does not store the password.

  4. Shut down Analytics Server services by using the following commands:

    /etc/init.d/novell-offline stop
    /etc/init.d/novell-realtime stop
    /etc/init.d/novell-jcc stop
    rcsentinel stop
    insserv -r sentinel
  5. Move the Analytics Server data folder to the shared storage by using the following commands. Moving the data allows all the nodes to access the Analytics Server data folder through the shared storage.

    mkdir -p /tmp/new
    mount /dev/<SHARED1> /tmp/new
    mv /var/opt/novell/sentinel /tmp/new
    umount /tmp/new/
  6. Verify that the Analytics Server data folder has moved to the shared storage by mounting and then unmounting the volume:

    mount /dev/<SHARED1> /var/opt/novell/
    umount /var/opt/novell/
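
(Optional) To confirm that the data directory itself is present on the shared volume, beyond the mount test in step 6, you can mount the volume once more and list its contents:

    mount /dev/<SHARED1> /var/opt/novell/
    # The sentinel data directory moved in step 5 should be listed
    ls /var/opt/novell/
    umount /var/opt/novell/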

Subsequent node

  1. Connect to the second cluster node (node02) and open a console window.

  2. Execute the following command:

    insserv -r sentinel
  3. Stop Analytics Server services.

    rcsentinel stop
  4. Remove the Analytics Server data directory.

    rm -rf /var/opt/novell/sentinel

At the end of this process, Analytics Server should be installed on all nodes, but it will likely not work correctly on any node except the first until the various keys are synchronized, which happens when you configure the cluster resources.

Cluster Installation and Configuration

Analytics Server includes the cluster software, so you do not need to install it manually.

As part of this configuration, you can also set up fencing and Shoot The Other Node In The Head (STONITH) resources to ensure cluster consistency.

For this solution you must use private IP addresses for internal cluster communications, and use unicast to minimize the need to request a multicast address from a network administrator. You must also use an iSCSI Target configured on the same SUSE Linux VM that hosts the shared storage to serve as a Split Brain Detection (SBD) device for fencing purposes.
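
For reference, after the cluster is configured in the following procedure, the unicast settings are stored in /etc/corosync/corosync.conf. The excerpt below is only an illustrative sketch of what such a configuration can look like with the example addresses used in this section; the file that YaST generates may differ in layout and defaults:

    totem {
      transport: udpu
      interface {
        ringnumber: 0
        bindnetaddr: 172.16.0.0
        mcastport: 5405
        member {
          memberaddr: 172.16.0.1
        }
        member {
          memberaddr: 172.16.0.2
        }
      }
    }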

NetIQ recommends the following procedure for cluster configuration:

SBD Setup

  1. Connect to storage03 and start a console session.

  2. Use the dd command to create a 1 MB file, filled with zeros copied from the /dev/zero pseudo-device file, to serve as the SBD device:

    dd if=/dev/zero of=/sbd count=1024 bs=1024

  3. Run YaST from the command line or the Graphical User Interface: /sbin/yast

  4. Select Network Services > iSCSI Target.

  5. Click Targets and select the existing target.

  6. Select Edit. The UI will present a list of LUNs (drives) that are available.

  7. Select Add to add a new LUN.

  8. Leave the LUN number as 1. Browse in the Path dialog and select the /sbd file that you created.

  9. Leave the other options at their defaults, select OK, and then click Next. Click Next again to accept the default authentication options.

  10. Click Finish to exit the configuration. Restart the services if needed. Exit YaST.

These steps expose an iSCSI Target for the SBD device on the server at IP address 10.0.0.3 (storage03).

NOTE: The following steps require that each cluster node be able to resolve the host names of all other cluster nodes (the file synchronization service csync2 fails otherwise). If DNS is not set up or available, add entries for each host to the /etc/hosts file that list each IP address and its host name (as reported by the hostname command). Also, ensure that you do not assign a host name to a loopback IP address.

Node Configuration

Connect to a cluster node (node01) and open a console:

  1. Run YaST.

  2. Open Network Services > iSCSI Initiator.

  3. Select Connected Targets, then the iSCSI Target you configured above.

  4. Select the Log Out option and log out of the Target.

  5. Switch to the Discovered Targets tab, select the Target, and log back in to refresh the list of devices (leave the automatic startup option and No Authentication).

  6. Select OK to exit the iSCSI Initiator tool.

  7. Open System > Partitioner and identify the SBD device as the 1MB IET-VIRTUAL-DISK. It will be listed as /dev/sdc or similar - note which one.

  8. Exit YaST.

  9. Execute the command ls -l /dev/disk/by-id/ and note the device ID that is linked to the device name you located above.

  10. Execute the command sleha-init.

  11. When prompted for the network address to bind to, specify the external NIC IP (172.16.0.1).

  12. Accept the default multicast address and port. You will override this later.

  13. Enter 'y' to enable SBD, then specify /dev/disk/by-id/<device id>, where <device id> is the ID you located above (you can use Tab to auto-complete the path).

  14. Complete the wizard and make sure no errors are reported.

  15. Start YaST.

  16. Select High Availability > Cluster (or just Cluster on some systems).

  17. In the box at left, ensure Communication Channels is selected.

  18. Tab over to the top line of the configuration, and change the udp selection to udpu (this disables multicast and selects unicast).

  19. Select Add to add a member address and specify this node (172.16.0.1), then repeat to add the other cluster node: 172.16.0.2.

  20. Select Finish to complete the configuration.

  21. Exit YaST.

  22. Run the command /etc/rc.d/openais restart to restart the cluster services with the new sync protocol.
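
(Optional) Before joining the second node, you can verify that the cluster stack and the SBD device are healthy on node01. This is a minimal check that assumes the standard SLE HAE command-line tools; replace <device id> with the ID you noted earlier:

    # The corosync ring should be active with no faults
    corosync-cfgtool -s
    # One-shot cluster status; node01 should be shown online
    crm_mon -1
    # List the message slots on the shared SBD device
    sbd -d /dev/disk/by-id/<device id> list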

Connect to the other cluster node (node02) and open a console:

  1. Run YaST.

  2. Open Network Services > iSCSI Initiator.

  3. Select Connected Targets, then the iSCSI Target you configured above.

  4. Select the Log Out option and log out of the Target.

  5. Switch to the Discovered Targets tab, select the Target, and log back in to refresh the list of devices (leave the automatic startup option and No Authentication).

  6. Select OK to exit the iSCSI Initiator tool.

  7. Run the following command: sleha-join

    Enter the IP address of the first cluster node.

  8. Run crm_mon on each cluster node to verify that the cluster is running properly.

(Conditional) If the cluster does not start correctly, perform the following steps:

  1. Manually copy /etc/corosync/corosync.conf from node01 to node02, or run csync2 -x -v on node01, or manually set the cluster up on node02 through YaST.

  2. Run /etc/rc.d/openais start on node02.

    (Conditional) If the xinetd service does not properly add the new csync2 service, the script does not function correctly. The xinetd service is required so that the other node can sync the cluster configuration files down to this node. If you see errors such as csync2 run failed, you might have this problem.

    To resolve this issue, execute the command kill -HUP `cat /var/run/xinetd.init.pid` and then re-run the sleha-join script.

  3. Run crm_mon on each cluster node to verify that the cluster is running properly. You can also use Hawk, the web console (by default at https://<node IP>:7630), to verify the cluster. The default login name is hacluster and the password is linux.

Perform the following tasks to modify additional parameters:

  1. To ensure that a single node failure in your two-node cluster does not unexpectedly stop the entire cluster, set the global cluster option no-quorum-policy to ignore:

    crm configure property no-quorum-policy=ignore

  2. To ensure that the resource manager allows resources to run in place rather than move unnecessarily, set the global cluster option default-resource-stickiness to 1:

    crm configure property default-resource-stickiness=1
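
(Optional) You can confirm that both properties were applied by querying the cluster configuration:

    # The output should include no-quorum-policy=ignore and default-resource-stickiness=1
    crm configure show | grep -E 'no-quorum-policy|default-resource-stickiness'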

Resource Configuration

Resource Agents are provided by default with SLE HAE. If you do not want to use SLE HAE, you need to monitor these additional resources using an alternate technology:

  • A filesystem resource corresponding to the shared storage that the software uses.

  • An IP address resource corresponding to the virtual IP by which the services will be accessed.

  • The PostgreSQL database software that stores configuration and event metadata.

NetIQ recommends the following for resource configuration:

NetIQ provides a crm script to aid in cluster configuration. The script pulls the relevant configuration variables from the unattended setup file generated as part of the Analytics Server installation. If you did not generate the setup file, or if you want to change the resource configuration, you can edit the script by using the following procedure.

Perform the following steps on the nodes on which Analytics Server is installed.

  1. Run the following commands on both nodes:

    • mkdir -p /var/log/agg-analytics-var

    • mkdir -p /var/log/agg-analytics-var/nam/logs/dashboard/tomcat

    • chown -R novell.novell /var/log/agg-analytics-var/

  2. On the primary node, run the following commands in the same sequence, where <SHARED1> is the shared volume you created previously:

    mount /dev/<SHARED1> /var/opt/novell
    ln -s /var/log/agg-analytics-var/nam /var/opt/novell/nam
    cd /usr/lib/ocf/resource.d/novell
    ./install-resources.sh

    The install-resources.sh script prompts you for a couple of values, namely the virtual IP address that you would like people to use to access Analytics Server and the device name of the shared storage, and then auto-creates the required cluster resources. Note that the script requires the shared volume to already be mounted, and also requires the unattended installation file that was created during the Analytics Server installation to be present (/tmp/install.props). You do not need to run this script on any node other than the first installed node; all relevant configuration files are automatically synchronized to the other nodes.

    After running the install-resources.sh script, if you get a file synchronization error, run csync2 -x -v. If the command reports that the files are up to date and returns no errors, you can continue with the deployment.

  3. Run the following commands on the primary node:

    • /etc/init.d/novell-offline start

    • /etc/init.d/novell-realtime start

    • /etc/init.d/novell-jcc start

  4. Run /etc/rc.d/openais restart on node02.

  5. If your environment varies from this NetIQ recommended solution, you can edit the resources.cli file (in the same directory) and modify the primitives definitions from there. For example, the recommended solution uses a simple Filesystem resource; you may wish to use a more cluster-aware cLVM resource.

  6. After running the shell script, you can issue the crm status command; the output should look similar to the following:

    crm status
    Last updated: Thu Jul 26 16:34:34 2012
    Last change: Thu Jul 26 16:28:52 2012 by hacluster via crmd on node01
    Stack: openais
    Current DC: node01 - partition with quorum
    Version: 1.1.6-b988976485d15cb702c9307df55512d323831a5e
    2 Nodes configured, 2 expected votes
    5 Resources configured.
    Online: [ node01, node02 ]
    stonith-sbd    (stonith:external/sbd):    Started node01
     Resource Group: sentinelgrp
         sentinelip    (ocf::heartbeat:IPaddr2):    Started node01
         sentinelfs    (ocf::heartbeat:Filesystem):    Started node01
         sentineldb    (ocf::novell:pgsql):    Started node01
         sentinelserver    (ocf::novell:sentinel):    Started node01
  7. Add the following entry to /etc/csync2/csync2.cfg:

    include /etc/opt/novell/sentinel/config/auth.login;
    include /etc/opt/novell/nam/sentinelReportAdminConfig;
  8. Run csync2 -x -v.

    If you get any errors, run the command again.
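
(Optional) On node02, you can confirm that the synchronized configuration files listed above are now present:

    ls -l /etc/opt/novell/sentinel/config/auth.login /etc/opt/novell/nam/sentinelReportAdminConfig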

NOTE: The preceding steps describe how to install and configure the cluster for high availability. However, to use this cluster configuration, you must enable clustering from Administration Console. For more information, see Section 3.3.3, Post-Installation Cluster Configuration for Analytics Server.

3.3.3 Post-Installation Cluster Configuration for Analytics Server

After performing the cluster installation steps, you can view the active and standby nodes in Administration Console. However, to use Analytics Server in high availability mode, you must perform the following tasks in Administration Console in the listed sequence:

Enable Clustering

  1. Click Devices > Analytics Server.

  2. In the cluster row, click Edit.

  3. In the Cluster Configuration section, set the Clustering is Configured: field to Yes and specify the same virtual IP address in the Cluster’s Virtual IP: field that you specified during resource configuration. For more information, see Step 2.

Configure the settings for auditing

Analytics Server is functional only when it is set as the audit server in Administration Console. To set it as the audit server, perform the following steps:

  1. Perform the steps mentioned in Enable Clustering.

  2. In the Admin Tasks pane, click Auditing.

  3. In the Audit Messages Using: field, select Syslog > Send to Analytics Server.

    The Server Listening Address field is auto-populated with the virtual IP address that you specified in Step 1.