5.6 Failover

In a failover, a failover workload within a PlateSpin Forge VM container takes over the business function of a failed workload.

5.6.1 Detecting Offline Workloads

PlateSpin Forge constantly monitors your protected workloads. If an attempt to monitor a workload fails a predefined number of times, PlateSpin Forge generates a Workload is offline event. The criteria that determine and log a workload failure are part of a workload protection’s Tier settings (see the Tier row in Workload Protection Details).

If notifications are configured along with SMTP settings, PlateSpin Forge simultaneously sends a notification e-mail to the specified recipients. See Setting Up Automatic E-Mail Notifications of Events and Reports.
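
The detection rule described above can be pictured with a minimal sketch. It is not PlateSpin Forge code: the class name, the max_failures parameter (a stand-in for the threshold in the Tier settings), and the assumption that only consecutive failures count are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class OfflineDetector:
    """Tracks failed monitoring attempts for one workload (illustrative only)."""

    max_failures: int              # hypothetical stand-in for the Tier failure threshold
    consecutive_failures: int = 0
    offline: bool = False

    def record_attempt(self, succeeded: bool) -> bool:
        """Record one monitoring attempt.

        Returns True exactly once, at the moment the failure count reaches
        the threshold, which is when an offline event would be generated.
        """
        if succeeded:
            # Assumption: a successful attempt resets the failure count.
            self.consecutive_failures = 0
            self.offline = False
            return False
        self.consecutive_failures += 1
        if not self.offline and self.consecutive_failures >= self.max_failures:
            self.offline = True
            return True
        return False
```

With a threshold of 3, the first two failed attempts return False, the third returns True, and further failures return False until a successful attempt resets the count.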

If a workload failure is detected while the status of the replication is Idle, you can proceed to the Run Failover command. If a workload fails while an incremental replication is underway, the replication job stalls. In this case, abort the command (see Aborting Commands), and then proceed to the Run Failover command. See Performing a Failover.
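
The choice between the two cases above can be summarized in a short sketch. The function name, the status strings, and the returned command labels are hypothetical and are used only to illustrate the order of operations; they are not a PlateSpin Forge interface.

```python
from typing import List


def failover_steps(replication_status: str) -> List[str]:
    """Return the ordered commands to run after a workload failure is detected."""
    status = replication_status.strip().lower()
    if status == "idle":
        # No replication is underway, so the failover can start directly.
        return ["Run Failover"]
    if status == "incremental":
        # The stalled incremental job must be aborted before failing over.
        return ["Abort Command", "Run Failover"]
    # Other states are outside the two cases described in the text.
    raise ValueError(f"unhandled replication status: {replication_status!r}")
```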

The following figure shows the PlateSpin Forge Web Interface’s Dashboard page upon detecting a workload failure. Note the applicable tasks in the Tasks and Events pane:

Figure 5-1 The Dashboard Page upon Workload Failure Detection (‘Workload Offline’)

5.6.2 Performing a Failover

Failover settings, including the failover workload’s network identity and LAN settings, are saved together with the workload’s protection details at configuration time. See the Failover row in Workload Protection Details.

You can use the following methods to perform a failover:

  • Select the required workload on the Workloads page and click Run Failover.

  • Click the corresponding command hyperlink of the Workload is offline event in the Tasks and Events pane. See Figure 5-1.

  • Run a Prepare for Failover command to boot the failover VM ahead of time. You still have the option to cancel the failover (useful in staged failovers).

Use one of these methods to start the failover process and select a recovery point to apply to the failover workload (see Recovery Points). Click Execute and monitor the progress. Upon completion, the replication status of the workload should indicate Live.

For testing the failover workload or testing the failover process as part of a planned disaster recovery exercise, see Using the Test Failover Feature.

5.6.3 Using the Test Failover Feature

PlateSpin Forge lets you test the failover functionality and the integrity of the failover workload by using the Test Failover command, which boots the failover workload in a restricted network environment for testing.

When you execute the command, PlateSpin Forge applies the Test Failover Settings, as saved in the workload protection details, to the failover workload (see the Test Failover row in Workload Protection Details).

  1. Define an appropriate time window for testing and make sure that there are no replications underway. The replication status of the workload must be Idle.

  2. On the Workloads page, select the required workload, click Test Failover, select a recovery point (see Recovery Points), and then click Execute.

    Upon completion, PlateSpin Forge generates a corresponding event and a task with a set of applicable commands.

  3. Verify the integrity and business functionality of the failover workload. Use the VMware vSphere Client to access the failover workload in the appliance host.

    See Downloading the VMware Client Program.

  4. Mark the test as a failure or a success. Use the corresponding commands in the task (Mark Test Failure, Mark Test Success). The selected action is saved in the history of events associated with the workload and is retrievable by reports. Dismiss Task discards the task and the event.

    Upon completion of the Mark Test Failure or Mark Test Success tasks, PlateSpin Forge discards temporary settings that were applied to the failover workload, and the protection returns to its pre-test state.
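
The test procedure above amounts to a small state cycle: the test can start only from the Idle status, temporary settings apply while the test runs, and marking the result records it in the history and restores the pre-test state. The following sketch models that cycle under stated assumptions; the class, its fields, and the "Test Failover" status label are hypothetical and do not represent PlateSpin Forge internals.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProtectedWorkload:
    """Illustrative model of a workload going through a test failover."""

    replication_status: str = "Idle"
    test_settings_applied: bool = False
    history: List[str] = field(default_factory=list)

    def start_test_failover(self, recovery_point: str) -> None:
        # Step 1: no replication may be underway when the test starts.
        if self.replication_status != "Idle":
            raise RuntimeError("replication status must be Idle before testing")
        # Step 2: temporary test settings are applied to the failover workload.
        self.test_settings_applied = True
        self.replication_status = "Test Failover"  # hypothetical status label
        self.history.append(f"Test failover started ({recovery_point})")

    def mark_test(self, success: bool) -> None:
        # Step 4: the outcome is saved in the workload's history of events.
        if not self.test_settings_applied:
            raise RuntimeError("no test failover in progress")
        self.history.append("Mark Test Success" if success else "Mark Test Failure")
        # Temporary settings are discarded and the pre-test state returns.
        self.test_settings_applied = False
        self.replication_status = "Idle"
```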