1.5 Troubleshooting General VM Management Issues

The following sections provide solutions to problems you might encounter while performing general VM management operations:

Volume Tools Hang While Scanning a Suspended Device

Source: Scanned device.
Explanation: When a mapped device is in a suspended state, volume tools such as vgscan, lvscan, and pvscan hang. If the vmprep job is run on such a device, it throws an error such as the following to alert you to the condition:
vmquery: /var/adm/mount/vmprep.df8fd49401e44b64867f1d83767f62f5: Failed to
mount vm image "/mnt/nfs_share/vms/rhel4tmpl2/disk0": Mapped device
/dev/mapper/loop7p2 appears to be suspended. This might cause scanning for
volume groups (e.g. vgscan) to hang.
WARNING! You may need to manually resume or remove this mapped device (e.g.
dmsetup remove /dev/mapper/loop7p2)!
Action: Because of this behavior, we recommend that you do not use LVM or similar volume management tools on virtual machines managed by PlateSpin Orchestrate. If the condition does occur, manually resume or remove the suspended mapped device, as the warning indicates.
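
The following is a minimal sketch of that manual cleanup, assuming the device name reported in the warning (/dev/mapper/loop7p2); substitute the device name from your own message:

# List mapped devices and their state; a suspended device is reported as SUSPENDED
dmsetup info -c

# Resume the suspended device so that vgscan, lvscan, and pvscan no longer hang on it
dmsetup resume /dev/mapper/loop7p2

# Or, if the mapping is no longer needed, remove it as the warning suggests
dmsetup remove /dev/mapper/loop7p2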

SUSE Linux VMs Might Attempt To Partition a Read-only Device

Source: YaST Partitioner.
Explanation: When you build a SUSE Linux VM and specify a read-only virtual device for that VM, in some instances the YaST partitioner might propose a re-partitioning of the read-only virtual device.
Possible Cause: Although Xen normally attempts to notify the guest OS kernel about the mode (ro or rw) of the virtual device, under certain circumstances the YaST partitioner proposes a re-partitioning of the virtual device that has the most available disk space without considering the other device attributes. For example, if a specified CD-ROM device happens to be larger than the specified hard disk device, YaST attempts to partition the CD-ROM device, which causes the VM installation to fail.
Action: To work around this issue, connect a VNC console to the VM being built during the first stage of the VM install, and verify the partition proposal before you continue. If the proposal has selected an incorrect device, manually change the selected device before you continue with the installation of the VM.
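
For example, on a Xen VM host where the libvirt tools are installed (an assumption; adapt this to the tools available in your environment), you could locate the VNC display of the VM being built and connect to it. The domain name rhel_vm_build and the host name vmhost.example.com are placeholders:

# On the VM host: report the VNC display assigned to the VM being built
virsh vncdisplay rhel_vm_build

# From a workstation with a VNC viewer installed: connect to that display on the VM host
vncviewer vmhost.example.com:0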

RHEL 5 VMs Running the Kudzu Service Do Not Retain Network Interface Changes

Source: Kudzu service.
Explanation: Any time you modify the hardware configuration of a RHEL 5 VM that is running the Kudzu hardware probing library (for example, by changing the MAC address or adding a network interface card), the VM does not retain its existing network interface configuration.
Possible Cause: When you start a RHEL 5 VM, the Kudzu service recognizes the hardware changes at boot time and moves the existing configuration for that network interface to a backup file. The service then rewrites the network interface configuration to use DHCP instead.
Action: To work around this problem, disable the Kudzu service within the RHEL VM by using the chkconfig --del kudzu command.
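
For example, run the following inside the RHEL 5 VM (a sketch; stopping the service affects the current session, and removing it from the startup configuration prevents it from running at boot):

# Stop the currently running Kudzu service
service kudzu stop

# Remove Kudzu from the startup configuration so it does not run at boot
chkconfig --del kudzu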

Policies Applied to VM Resources Are Deleted

Source: VM clones awaiting provisioning.
Explanation: Provisioning code requires that VMs and VM clones be standalone (that is, they are removed from a template dependency and are no longer considered to be “linked clones”).
Possible Cause: VMs in PlateSpin Orchestrate 2.5 and later must be made standalone to receive and retain associated policies.
Action: Apply a conditional policy to the parent template; the clones inherit it and it is applied while they are running. Depending on the facts set on each clone, the inherited vmhost constraint can be applied conditionally.

The following is an example of a conditional policy that you could apply to the VM template to restrict the vmhost selection based on resource attributes (group membership and so on):

<policy>
    <constraint type="vmhost">
        <if>
            <contains fact="resource.groups" value="exclude_me"
                      reason="Only apply this vmhost constraint to resources NOT in exclude_me resource group" >
            </contains>
            <else>
                <if>
                    <defined fact="resource.some_boolean_fact" />
                    <eq fact="some_boolean_fact" value="true" />
                    <then>
                        <contains fact="vmhost.resource.groups" value="first_vmhost_group"
                                reason="When a resource is not in the exclude_me group, when some_ boolean_fact is true,
                                        provision to a vmhost in the first_vmhost_group"/>
                    </then>
                    <else>
                        <if>
                            <defined fact="resource.some_other_boolean_fact" />
                            <eq fact="some_other_boolean_fact" value="true" />
                            <not>
                                <and>
                                    <eq fact="resource.id" value="never_use_this_resource"
                                      reason="Specifically exclude this resource from consideration." />
                                    <or>
                                        <eq fact="vmhost.cluster"
                                            factvalue="resource.provision.vmhost.cluster" />
                                        <eq fact="vmhost.cluster"
                                            factvalue="resource.provision.vmhost" />
                                    </or>
                                </and>
                            </not>
                            <then>
                                <contains fact="vmhost.resource.groups" value="another_vmhost_group"
                                        reason="When a resource is not in the exclude_me group, when some_ boolean_fact is false,
                                                and some_other_boolean_fact is true, (but also not some other things),
                                                provision to a vmhost in another_vmhost_group"/>
                            </then>
                        </if>
                    </else>
                </if>
            </else>
        </if>
    </constraint>
</policy>

VMs Provisioned from a VM Template Are Not Restarted When a VM Host Crashes

Source: VM host with VMs provisioned from a template.
Explanation: If a VM host crashes, VMs that were provisioned from a template on that host are not restarted on another active VM host. Instead, PlateSpin Orchestrate provisions another VM, cloned from the original template, on the next available VM host. The disk files of the original clone are not destroyed (that is, “cleaned up”) after the crash, but the original VM template files are destroyed.

If a Discover Repository action is issued before the cloned VM is deleted from the crashed host, Orchestrate creates a new VM object with the zombie_ string prepended to the VM object name.

Possible Cause: While hosting a provisioned clone, the VM host crashed or the Orchestrate Agent on that host went offline.
Action: To work around this issue, either remove the VM’s files from the file system before Orchestrate rediscovers it, or issue a Destroy action on the discovered “zombie” VM.
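
For example, to remove the stale clone’s files before the next Discover Repository action runs (a sketch; the repository path and clone name shown here are placeholders for the actual location of the clone’s disk images and configuration files):

# On the repository or the recovered VM host, delete the leftover clone directory
rm -rf /mnt/nfs_share/vms/rhel4tmpl2_clone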