6.12 Protecting Windows Clusters

PlateSpin Forge supports the protection of a Microsoft Windows cluster’s business services. The supported clustering technologies are:

  • Windows Server 2012 R2: Server-based Microsoft Failover Cluster (Node and Disk Majority Quorum and No Majority: Disk Only Quorum models)

  • Windows Server 2008 R2: Server-based Microsoft Failover Cluster (Node and Disk Majority Quorum and No Majority: Disk Only Quorum models)

  • Windows Server 2003 R2: Server-based Windows Cluster Server (Single-Quorum Device Cluster model)

You can enable or disable Windows cluster discovery for your PlateSpin environment. See Section 6.12.2, Enabling or Disabling Windows Cluster Discovery.

NOTE: The Windows cluster management software provides the failover and failback control for the resources running on its cluster nodes. This document refers to this action as a cluster node failover or a cluster node failback.

The PlateSpin Server provides the failover and failback control for the protected workload that represents the cluster. This document refers to this action as a PlateSpin failover or a PlateSpin failback.

6.12.1 Planning Your Cluster Workload Protection

PlateSpin Forge protects a cluster by streaming incremental replications of changes on the active node to a virtual one-node cluster, which you can use while troubleshooting the source infrastructure. Before you configure Windows clusters for protection, ensure that your environment meets the prerequisites and that you understand the conditions for protecting cluster workloads.

Prerequisites

The scope of support for cluster protection is subject to the following conditions:

  • Active node hostname or IP address: You must specify the hostname or IP address of the cluster’s active node when you perform an Add Workload operation. Because of security changes made by Microsoft, Windows clusters can no longer be discovered by using the virtual cluster name (that is, the shared cluster IP address).

  • Active node discovery: Ensure that the PlateSpin global configuration setting DiscoverActiveNodeAsWindowsCluster is set to True on the PlateSpin Server Configuration page. This is the default setting. See Section 6.12.2, Enabling or Disabling Windows Cluster Discovery.

  • Resource name search values: You must specify search values that help PlateSpin Forge differentiate the name of the shared Cluster IP Address resource from the names of other IP address resources on the cluster. See Section 6.12.3, Adding Resource Name Search Values.

  • Resolvable hostname: The PlateSpin Server must be able to resolve the hostname of each of the nodes in the cluster.

    NOTE: The IP address must also resolve back to the hostname. That is, both hostname (forward) lookup and reverse lookup are required.

  • Quorum resource: A cluster’s quorum resource must be co-located on the node with the cluster’s resource group (service) being protected.

  • PowerShell 2.0: Windows PowerShell 2.0 Engine must be installed on each node of the cluster.
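The resolvable-hostname prerequisite can be checked before you add the workload. The following is a minimal sketch, not part of PlateSpin Forge: the function name check_forward_and_reverse is hypothetical, and comparing only the short (unqualified) hostname is an assumption made for illustration. Run it on the PlateSpin Server host so that it uses the same DNS configuration.

```python
import socket
from typing import Callable


def check_forward_and_reverse(
    hostname: str,
    forward: Callable[[str], str] = socket.gethostbyname,
    reverse: Callable[[str], tuple] = socket.gethostbyaddr,
) -> bool:
    """Return True if hostname resolves to an IP address and that
    address resolves back to the same short hostname."""
    try:
        ip_address = forward(hostname)               # forward (hostname) lookup
        name, aliases, _addresses = reverse(ip_address)  # reverse lookup
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return False

    # Assumption: a match on the unqualified name is sufficient.
    short = hostname.split(".")[0].lower()
    candidates = [name, *aliases]
    return any(c.split(".")[0].lower() == short for c in candidates)


if __name__ == "__main__":
    # Hypothetical node names; replace them with the nodes of your cluster.
    for node in ["clusternode1", "clusternode2"]:
        status = "ok" if check_forward_and_reverse(node) else "FAILED"
        print(f"{node}: {status}")
```

The resolver functions are parameters only so that the check can be exercised without a live DNS server; by default the standard library resolvers are used.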

Block-Based Transfer

When you use block-based transfer for cluster workloads, the block-based driver components are not installed on the cluster nodes. The block-based transfer occurs using a driverless synchronization with an MD5-based replication. Because the block-based driver is not installed, no reboot is required on the source cluster nodes.
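To illustrate the general idea of hash-based block comparison, the sketch below computes an MD5 digest for each fixed-size block of a source and a target and reports which source blocks would need to be sent. It is a conceptual example only and does not represent the PlateSpin implementation; the block size and the function names are assumptions.

```python
import hashlib
from typing import List

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Return the MD5 hex digest of each fixed-size block of data."""
    return [
        hashlib.md5(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def blocks_to_send(source: bytes, target: bytes,
                   block_size: int = BLOCK_SIZE) -> List[int]:
    """Return the indexes of source blocks that differ from, or are
    missing on, the target."""
    source_digests = block_digests(source, block_size)
    target_digests = block_digests(target, block_size)
    return [
        index
        for index, digest in enumerate(source_digests)
        if index >= len(target_digests) or target_digests[index] != digest
    ]
```

Because only digests are compared, no driver has to track changes on the source; the cost is that every block must be read and hashed on each replication.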

NOTE: File-based transfer is not supported for protecting Microsoft Windows clusters.

Cluster Node Failover during the First Full Replication

A cluster workload requires that the first full replication completes successfully without a cluster node failover. If a cluster node failover occurs prior to the completion of the first full replication, you must remove the existing workload, re-add the cluster using the active node, and try again.

Cluster Node Failover during Replication

If a cluster node failover occurs prior to the completion of the copy process during a full replication or an incremental replication, the command aborts and a message is displayed indicating that the replication needs to be re-run.

Cluster Node Failover between Replications

The nodes must have similar profiles to prevent interruptions in the replication process. If a cluster node failover occurs between the incremental replications of a protected cluster and the new active node’s profile is similar to that of the failed active node, the protection contract continues as scheduled for the next incremental replication. Otherwise, the next incremental replication command fails.

The profiles of cluster nodes are considered similar if all of the following conditions are met:

  • Serial numbers for the nodes’ local volumes (System volume and System Reserved volume) must be the same on each cluster node.

    NOTE: Use the customized Volume Manager utility to change the local volume serial numbers so that they match on each node of the cluster. See Synchronizing Serial Numbers on Cluster Node Local Storage.

    If the local volumes on each node of the cluster have different serial numbers, you cannot run a replication after a cluster node failover occurs. For example, during a cluster node failover, the active node Node 1 fails, and the cluster software makes Node 2 the active node. If the local drives on the two nodes have different serial numbers, the next replication command for the workload fails.

  • The nodes must have the same number of volumes.

  • Each volume must be exactly the same size on each node.

  • The nodes must have an identical number of network connections.
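The similarity conditions above can be expressed as a simple comparison. The sketch below is illustrative only: the Volume and NodeProfile types are hypothetical, you would have to collect the values yourself from each node, and matching volumes by name is an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Volume:
    name: str        # for example "C:" or "System Reserved"
    serial: str      # volume serial number as reported by the node
    size_bytes: int


@dataclass(frozen=True)
class NodeProfile:
    local_volumes: Tuple[Volume, ...]
    network_connections: int


def profile_differences(a: NodeProfile, b: NodeProfile) -> List[str]:
    """List every condition under which the two node profiles differ."""
    problems = []
    if len(a.local_volumes) != len(b.local_volumes):
        problems.append("different number of volumes")
    if a.network_connections != b.network_connections:
        problems.append("different number of network connections")

    # Assumption: volumes on the two nodes are paired by name.
    b_by_name = {volume.name: volume for volume in b.local_volumes}
    for volume in a.local_volumes:
        other = b_by_name.get(volume.name)
        if other is None:
            problems.append(f"volume {volume.name} not found on second node")
            continue
        if volume.serial != other.serial:
            problems.append(f"serial number mismatch on {volume.name}")
        if volume.size_bytes != other.size_bytes:
            problems.append(f"size mismatch on {volume.name}")
    return problems


def profiles_similar(a: NodeProfile, b: NodeProfile) -> bool:
    """True only if all of the similarity conditions are met."""
    return not profile_differences(a, b)
```

Returning the list of differences, rather than only a Boolean, makes it easier to see which condition to correct (for example, which volume needs its serial number synchronized).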

Protection Setup

To configure protection for a Windows cluster, follow the normal workload protection workflow. Ensure that you provide the hostname or IP address of the cluster’s active node. See Basic Workflow for Workload Protection and Recovery.

6.12.2 Enabling or Disabling Windows Cluster Discovery

The PlateSpin Forge Server can discover and inventory Windows Server failover clusters in your PlateSpin environment based on the active node in each cluster. Alternatively, it can treat all active and non-active cluster nodes as standalone machines.

To enable cluster discovery for all Windows clusters, ensure that the parameter DiscoverActiveNodeAsWindowsCluster is set to True. This is the default setting. Cluster discovery, inventory, and workload protection use the hostname or IP address of a cluster’s active node, instead of using its cluster name and an administration share. You do not configure separate workloads for the cluster’s non-active nodes. For other cluster workload protection requirements, see Prerequisites.

To disable cluster discovery for all Windows clusters, set the parameter DiscoverActiveNodeAsWindowsCluster to False. This setting allows the PlateSpin Server to discover all nodes in a Windows failover cluster as standalone machines. That is, it inventories a cluster’s active node and non-active nodes as regular, cluster-unaware Windows workloads.

To enable or disable cluster discovery:

  1. Go to the PlateSpin Server configuration page at

    https://<platespin-server-ip-address>/PlatespinConfiguration

  2. Search for DiscoverActiveNodeAsWindowsCluster, then click Edit.

  3. In the Value field, select True to enable cluster discovery, or select False to disable cluster discovery.

  4. Click Save.

6.12.3 Adding Resource Name Search Values

To help identify the active node in a Windows failover cluster, PlateSpin Forge must differentiate the name of the shared Cluster IP Address resource from the names of other IP address resources on the cluster. The shared Cluster IP Address resource resides on the cluster’s active node.

The global parameter MicrosoftClusterIPAddressNames on the PlateSpin Server Configuration page contains a list of search values to use in discovery for a Windows cluster workload. When you add a Windows cluster workload, you must specify the IP address of the cluster’s currently active node. PlateSpin Forge searches the names of the cluster’s IP address resources on that node to find one that starts with the specified characters of any value in the list. Thus, each search value must contain enough characters to differentiate the shared Cluster IP Address resource on a specific cluster, but it can be short enough to apply to discovery in other Windows clusters.

For example, a search value of Clust IP Address or Clust IP matches the resource names Clust IP Address for 10.10.10.201 and Clust IP Address for 10.10.10.101.

The default name for the shared Cluster IP Address resource is Cluster IP Address in English, or the equivalent if the cluster node is configured in another language. The default search values in the MicrosoftClusterIPAddressNames list include the resource name Cluster IP Address in English and each of the supported languages.

Because the resource name of the shared Cluster IP Address resource is user-configurable, you must add other search values to the list, as needed. If you change the resource name, you must add a related search value to the MicrosoftClusterIPAddressNames list. For example, if you specify a resource name of Win2012-CLUS10-IP-ADDRESS, you should add that value to the list. If you have multiple clusters using the same naming convention, an entry of Win2012-CLUS matches any resource name that starts with that sequence of characters.
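The prefix matching described in this section can be sketched as follows. This is an illustration of the matching rule only, not PlateSpin code: the function name is hypothetical, and case-sensitive comparison is an assumption.

```python
from typing import Iterable, Optional


def find_cluster_ip_resource(resource_names: Iterable[str],
                             search_values: Iterable[str]) -> Optional[str]:
    """Return the first resource name that starts with any search value,
    or None if no resource name matches."""
    # Ignore empty entries; an empty prefix would match every name.
    prefixes = [value for value in search_values if value]
    for name in resource_names:
        if any(name.startswith(prefix) for prefix in prefixes):
            return name
    return None


# Resource names and search values taken from the examples in this section.
resources = ["File Server IP", "Clust IP Address for 10.10.10.201"]
values = ["Cluster IP Address", "Clust IP"]
match = find_cluster_ip_resource(resources, values)
```

Here the default value Cluster IP Address does not match, because the resource name begins with Clust rather than Cluster; the shorter value Clust IP does. A search value that is too short can match an unrelated IP address resource, so keep each value long enough to be distinctive.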

To add search values in the MicrosoftClusterIPAddressNames list:

  1. Go to the PlateSpin Server configuration page at

    https://<platespin-server-ip-address>/PlatespinConfiguration

  2. Search for MicrosoftClusterIPAddressNames, then click Edit.

  3. In the Value field, add one or more search values to the list.

  4. Click Save.

6.12.4 Quorum Arbitration Timeout

You can set the QuorumArbitrationTimeMax registry key for Windows Server failover clusters in your PlateSpin environment by using the global parameter FailoverQuorumArbitrationTimeout on the PlateSpin Server Configuration page. The default timeout is 60 seconds, in keeping with the Microsoft default value for this setting. See QuorumArbitrationTimeMax on the Microsoft Developer Network website. The specified timeout interval is honored for quorum arbitration at failover and failback.

To set the quorum arbitration timeout for all Windows failover clusters:

  1. Go to the PlateSpin Server configuration page at

    https://<platespin-server-ip-address>/PlatespinConfiguration

  2. Search for FailoverQuorumArbitrationTimeout, then click Edit.

  3. In the Value field, specify the maximum number of seconds to allow for quorum arbitration.

  4. Click Save.

6.12.5 Setting Local Volume Serial Numbers

You can use the Volume Manager utility to change the local volume serial numbers so that they match on each node of the cluster. See Synchronizing Serial Numbers on Cluster Node Local Storage.

6.12.6 PlateSpin Failover

When the PlateSpin failover operation is complete and the virtual one-node cluster comes online, you see a multi-node cluster with one active node (all other nodes are unavailable).

To perform a PlateSpin failover (or to test the PlateSpin failover) on a Windows cluster, the cluster must be able to connect to a domain controller. To leverage the test failover functionality, you need to protect the domain controller along with the cluster. During the test, bring up the domain controller first, followed by the Windows cluster workload (on an isolated network).

6.12.7 PlateSpin Failback

A PlateSpin failback operation requires a full replication for Windows Cluster workloads.

If you configure the PlateSpin failback as a full replication to a physical target, you can use one of these methods:

  • Map all disks on the PlateSpin virtual one-node cluster to a single local disk on the failback target.

  • Add another disk (Disk 2) to the physical failback machine. You can then configure the PlateSpin failback operation to restore the failover's system volume to Disk 1 and the failover's additional disks (the previously shared disks) to Disk 2. This allows the system disk to be restored to a storage disk of the same size as the original source.

After a PlateSpin failback is complete, you must reattach the shared storage and rebuild the cluster environment before you can rejoin additional nodes to the newly restored cluster.

NOTE: When the cluster is at the stage of Ready To Reprotect, ensure that you first rebuild and restore the failback target so that it gets discovered as a cluster. You must manually uninstall the PlateSpin Cluster Driver as part of the rebuild process.

For information about rebuilding the cluster environment after a PlateSpin failover and failback occurs, see the following resources: