1.6 Understanding VM Host Failover

When the Orchestration Server comes back online after being offline, it rediscovers the state of all resources, including VM hosts and the VMs running on those hosts. This section describes how the Orchestration Server behaves when a VM host fails or loses its agent connection.

Two scenarios are possible when a VM host fails while running VMs. The failover behavior depends on where the VM image is stored and on whether the Orchestration Agent is installed on the VM.

The following table shows the possible VM host failover scenarios and the expected server behavior for each.

Table 1-1 Orchestration Server Behavior when the VM Host Loses Its Agent Connection


Failover Behavior

Scenario 1: The VM image is:

  • Stored on a non-local repository (for example, the zos repository)

  • Accessible by other VM hosts

  • Successfully provisioned

Situation: The VM host fails.

Failover behavior: The VMs that had been running on the failed VM host are reprovisioned to other available VM hosts.

  • If the VM was provisioned from a template, there is now another instance of the VM. For example, if the template name is sles10template, the original VM provisioned from the template is then named sles10template-1.

    If the host running sles10template-1 goes down, or if it loses its agent connection, a new instance of the template named sles10template-2 is reprovisioned to an available host.

  • If the original VM was a standalone VM, it is reprovisioned to an available host.

Scenario 2: The VM image is stored on a local repository.

Situation: The VM host loses its agent connection.

  • Because the VM image is stored locally, it cannot be reprovisioned to another VM host.

  • When the VM host comes back online, the VM is reprovisioned to that host, where its image is stored.
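The instance naming described for Scenario 1 can be sketched in a few lines of Python. This is only an illustration of the pattern in the example (sles10template-1, then sles10template-2), not the Orchestration Server's implementation. The function name next_instance_name is hypothetical, and the rule "highest existing suffix plus one" is an assumption extrapolated from the two names given above.

```python
import re
from typing import Iterable


def next_instance_name(template: str, existing: Iterable[str]) -> str:
    """Return the next '<template>-<n>' name (hypothetical helper).

    Assumption: the suffix is one greater than the highest suffix
    already in use for this template. The documentation only shows
    the sequence -1, -2.
    """
    pattern = re.compile(re.escape(template) + r"-(\d+)")
    highest = 0
    for name in existing:
        match = pattern.fullmatch(name)
        if match:
            highest = max(highest, int(match.group(1)))
    return f"{template}-{highest + 1}"


# First VM provisioned from the template, then its replacement
# after the host running sles10template-1 goes down.
first = next_instance_name("sles10template", [])
second = next_instance_name("sles10template", [first])
print(first, second)
```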

In either scenario, if the Orchestration Agent is installed on the VMs and the VM host loses its agent connection while the VMs retain theirs (for example, if someone kills the agent process on the VM host), no reprovisioning occurs.

If the VM host loses its agent connection and the Orchestration Agent is not installed on the running VMs, the VMs can continue running on that host indefinitely. However, if the VM image is stored on a non-local repository that other VM hosts can access (Scenario 1), the VMs are also reprovisioned to other available hosts. This can leave two or more instances of the same VM running on different VM hosts. Because the Orchestration Server is aware only of the VMs running on a VM host with an active agent connection, the administrator must stop the VMs on the host that has lost its agent connection.
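The rules in this section can be summarized as a small decision function. The following Python sketch is one reading of the text, not Orchestration Server code. The names Action, failover_action, and may_leave_duplicates are hypothetical, and the two Boolean inputs are simplifications: image_shared means the image is on a non-local repository that other VM hosts can access, and vm_agent_connected is false both when the Orchestration Agent is not installed on the VM and when the VM has lost its own agent connection.

```python
from enum import Enum


class Action(Enum):
    NO_REPROVISION = "no reprovisioning"
    REPROVISION_ELSEWHERE = "reprovision to another available VM host"
    RESTORE_ON_ORIGINAL_HOST = "reprovision to the original host when it returns"


def failover_action(image_shared: bool, vm_agent_connected: bool) -> Action:
    """Expected behavior after a VM host loses its agent connection."""
    if vm_agent_connected:
        # The VM still reports in through its own agent, so it is
        # known to be running and nothing is reprovisioned.
        return Action.NO_REPROVISION
    if image_shared:
        # Scenario 1: another host can reach the image.
        return Action.REPROVISION_ELSEWHERE
    # Scenario 2: the image exists only on the disconnected host.
    return Action.RESTORE_ON_ORIGINAL_HOST


def may_leave_duplicates(image_shared: bool, vm_agent_installed: bool) -> bool:
    """True when the original VM may keep running unseen while a copy
    is reprovisioned elsewhere, so an administrator must stop it."""
    return image_shared and not vm_agent_installed


print(failover_action(image_shared=True, vm_agent_connected=False).value)
```

Collapsing "agent not installed" and "agent connection lost" into a single flag keeps the sketch short. The separate may_leave_duplicates check covers the one case where the text distinguishes them: a VM without the agent can continue running on the disconnected host after a copy has been started elsewhere.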

NOTE: If you are interested in failover in a high availability environment, see the NetIQ Cloud Manager 2.1.5 Orchestration Server High Availability Configuration Guide or the NetIQ Cloud Manager 2.1.5 SUSE Xen VM High Availability Configuration Guide.