6.3 Implementing a High Availability/Disaster Recovery (Hot/Hot) Configuration

This section describes a standard methodology for creating and maintaining a High Availability/Disaster Recovery configuration for an Operations Center implementation. It serves as a starting point for a site-specific High Availability implementation plan that you can customize, as required, for your environment.

A Hot/Hot architecture is required to implement a High Availability (HA) configuration with “high nines” availability. It requires at least two identically configured systems that are up, running, and available to users, as well as a separate Disaster Recovery platform. For example:

Figure 6-2 High Availability/Disaster Recovery Configuration
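To make “high nines” concrete, the following minimal sketch converts an availability figure into expected downtime per year and estimates the combined availability of redundant servers. It assumes that server failures are independent, which is an idealization, and the 99% per-server figure is a hypothetical input rather than a measured Operations Center value.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(availability: float) -> float:
    """Expected yearly downtime for an availability fraction such as 0.999."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def combined_availability(per_server: float, servers: int) -> float:
    """Availability of N redundant servers, ASSUMING independent failures.

    The service is down only when every server is down at the same time.
    """
    return 1.0 - (1.0 - per_server) ** servers


if __name__ == "__main__":
    single = 0.99  # hypothetical availability of one server
    pair = combined_availability(single, 2)
    for label, value in (("one server", single), ("two servers", pair)):
        minutes = downtime_minutes_per_year(value)
        print(f"{label}: {value:.4%} available, {minutes:.0f} min/year down")
```

Under these assumptions, one server at 99% is down for about 5,256 minutes per year, while two such servers in a Hot/Hot pair reach 99.99%, or about 53 minutes per year. Shared dependencies, such as the backend database, lower the real figure.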

6.3.1 Servers

Table 6-1 describes the recommended configuration to meet the HA requirements, as well as additional requirements around system upgrades, maintenance, and so on.

Table 6-1 HA Configuration Requirements

Server Site: Main Site

Servers Required:

  • Primary Server: An Operations Center Server with all adapters, Service Models, Users/Groups, SCM definitions, BSLM definitions and Dashboards.

  • Backup Server: One additional Server with a “mirrored” image of the Primary Server.

Server Site: Disaster Recovery Site

Servers Required:

  • Disaster Recovery Server: An additional Server, a “mirrored” image of the Primary Server.

For the above architecture:

  • All Operations Center Servers are configured with the Clustering option turned on and use the same backend databases for the configStore and Service Warehouse

  • The backend database requires a separate, third-party Hot/Hot clustering technology specific to that database implementation

6.3.2 Load Balancers for Traffic Management

The use of a load balancer is recommended for traffic management with a Hot/Hot solution. A local load balancer is placed in front of the Primary and Backup Server in the Main Site. An additional load balancer is added in front of the Main Site to direct users to the Disaster Recovery site in the event of a complete site outage.

Users log in to the Operations Center server via the Site Load Balancer. The Site Load Balancer directs users to the Primary Operations Center server if it is running, or to the Backup Operations Center server if the Primary is down. If both the Primary and Backup servers are down, users are directed to the Disaster Recovery (DR) Operations Center server.
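The routing order described above can be sketched as a small selection function. This is an illustration of the decision logic only: a real load balancer expresses it through its own health checks and pool configuration, and the server labels used here ("primary", "backup", "dr") are hypothetical names, not Operations Center settings.

```python
from typing import Mapping, Optional

# Failover order from the text: Primary, then Backup, then Disaster Recovery.
FAILOVER_ORDER = ("primary", "backup", "dr")


def choose_server(is_up: Mapping[str, bool]) -> Optional[str]:
    """Return the first server in failover order that is reported up.

    Servers missing from the mapping are treated as down.
    Returns None when no server is available.
    """
    for name in FAILOVER_ORDER:
        if is_up.get(name, False):
            return name
    return None


if __name__ == "__main__":
    # Hypothetical health-check result: Primary down, the others up.
    print(choose_server({"primary": False, "backup": True, "dr": True}))
```

With the Primary down and the Backup up, the function returns "backup". It returns "dr" only when both Main Site servers are down.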

Load balancing solutions are available from vendors such as F5 Networks, Cisco, and Symantec. Operations Center does not recommend any particular vendor, brand, make, or model.

6.3.3 Virtual Hosts for Traffic Management

This section describes the steps required to use a load balancer with a virtual host.

To configure a load balancer to work with a virtual host:

  1. In the customizer, set the host name to the virtual host name and set the IP address to the actual IP address of the server.

  2. In the unit.xml file located in NOC/SelfTestScripts/NOC, locate the tests with className="com.mosol.selftest.tests.LocalHost" and set critical="false" and mandatory="false".

  3. Edit the Formula.custom.properties file located in the config folder to have the following settings:



  4. In etc/hosts, add an entry for the real IP address and the virtual host name.

  5. Ensure that host name resolution checks etc/hosts before it queries the DNS name server.
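Steps 2 and 4 above are mechanical edits, so they lend themselves to a scripted check. The following is a minimal sketch, not a supported tool. It assumes only what the steps state: that the tests in unit.xml carry a className attribute, and that the hosts file uses the usual "IP address, then names" line format. The element names, IP address, and host name in the usage example are made up for illustration.

```python
import xml.etree.ElementTree as ET

LOCALHOST_TEST = "com.mosol.selftest.tests.LocalHost"


def relax_localhost_tests(xml_text: str) -> str:
    """Step 2: set critical="false" and mandatory="false" on every element
    whose className attribute names the LocalHost self-test."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.get("className") == LOCALHOST_TEST:
            elem.set("critical", "false")
            elem.set("mandatory", "false")
    return ET.tostring(root, encoding="unicode")


def hosts_maps(hosts_text: str, ip: str, hostname: str) -> bool:
    """Step 4: True if the hosts-file text maps ip to hostname."""
    wanted = hostname.lower()
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop trailing comments
        if fields and fields[0] == ip:
            if wanted in (name.lower() for name in fields[1:]):
                return True
    return False


if __name__ == "__main__":
    # Hypothetical fragment; the real unit.xml layout may differ.
    sample = (
        '<tests>'
        '<test className="com.mosol.selftest.tests.LocalHost" '
        'critical="true" mandatory="true"/>'
        '</tests>'
    )
    print(relax_localhost_tests(sample))
    print(hosts_maps("10.0.0.5 noc-virtual\n", "10.0.0.5", "noc-virtual"))
```

ElementTree does not preserve XML comments or the XML declaration when it re-serializes, so run a script like this against a copy of unit.xml and compare the result before replacing the original.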

6.3.4 Backend Database

The backend database, which houses the configStore and Service Warehouse, requires its own High Availability configuration based on the offerings available for that database. If another database server can handle failover, rather than the database being restored from backups, downtime can be shortened considerably.

Review the solutions offered by your database vendor. Many vendors offer standard clustering options that support either Hot/Cold or Hot/Warm database methodologies. For example, the database files can reside on a shared disk, with two servers using the shared disk subsystem to provide the database role.