1.6 Troubleshooting vSphere VM Provisioning Actions

The following sections provide solutions to problems you might encounter while performing provisioning actions on VMs managed by the VMware vCenter hypervisor:

Unable to perform any Provisioning Adapter Action after the Save Config Action on the vSphere Managed VM

Source: The Orchestration Console.
Possible Cause: The VM UUID value of the vSphere managed VM is not a 128-bit hexadecimal value. Even though the Save Config action is successful and the VM is provisioned, the hypervisor automatically assigns a different UUID value. Subsequently, any provisioning adapter action performed on the VM fails.
Action: Specify a 128-bit hexadecimal value for the VM UUID.
  1. In the Orchestration Console, click Resources > the vSphere managed VM.

    The Info/Groups tab is displayed by default.

  2. In the Virtual Machine Configuration panel, set the value of VM UUID to a 128-bit hexadecimal value.

  3. Right-click the vSphere managed VM, then click Save Config.
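
Before entering a value in Step 2, it can help to confirm that the string really represents 128 bits of hexadecimal. The following is a minimal sketch in plain Python and is not part of the product. The function name is hypothetical, and the sketch assumes the UUID is written as 32 hexadecimal digits, with hyphens or spaces used only as separators.

```python
import string


def is_128_bit_hex(value):
    """Return True if value contains exactly 32 hex digits (128 bits).

    Assumption: hyphens and spaces are separators only and are ignored.
    """
    digits = value.replace("-", "").replace(" ", "")
    return len(digits) == 32 and all(c in string.hexdigits for c in digits)
```

A value such as a VM name or a shortened ID fails this check, which matches the condition described under Possible Cause.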

(503) Service Unavailable Errors Might Occur while Cloning vSphere VMs

Source: The Orchestration Console.
Explanation: Running the Clone action repeatedly on vSphere VM templates might result in the following error:
Clone : (503)Service Unavailable
Possible Cause: This error indicates that the server is currently unable to handle the request because it is temporarily overloaded or undergoing maintenance. Testing has shown that this error is most likely to occur when vSphere and the Orchestration Agent are both installed on the same Windows Server 2003 computer.
Action: If you encounter this error, we recommend that you download and apply the appropriate Microsoft hotfix to the vCenter server.

Invalid Datastore Path Error

Source: The Orchestration Server.
Explanation: When you attempt to run the Save Config action on a vSphere VM with an ISO-backed vDisk (for example, a vDisk that specifies a location in the /vmimages folder and does not have its repository fact set), the job fails with a message similar to the following:
VMSaveConfig : Invalid datastore path '/vmimages/tools-isoimages/linux.iso'
Action: To work around this issue, associate a policy with the ISO-backed vDisk object that prepends an empty datastore string ([]) to the vdisk.location fact. For example:
<policy>
  <vdisk>
    <fact name="location" 
          type="String" 
          value="[] /vmimages/tools-isoimages/linux.iso" />
  </vdisk>
</policy>
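
If you generate such policies for several vDisks, the string change itself can be sketched as follows. This is an illustrative plain-Python helper, not a product API. The function name is hypothetical, and the rule that a location starting with "[" already names a datastore is an assumption.

```python
def with_empty_datastore(location):
    """Prefix a vDisk location with the empty datastore string '[] '.

    Assumption: a location that already starts with '[' names a datastore
    and is returned unchanged.
    """
    if location.lstrip().startswith("["):
        return location
    return "[] " + location


print(with_empty_datastore("/vmimages/tools-isoimages/linux.iso"))
# prints: [] /vmimages/tools-isoimages/linux.iso
```

Leaving already prefixed locations unchanged means the helper can be applied more than once without producing a doubled prefix.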

Running Provisioning Operations on a Batch of vSphere VMs Results in JDL Event Handler Errors

Source: The Orchestration Console.
Explanation: If you write JDL scripts to automate provisioning actions for a large number of vSphere VMs, you might receive a failure notice similar to the following:
Job 'testadmin.r_testvm_resync_batch.15684' terminated because of failure.
Reason: job exceeded max limit of jdl event handler

You also see the following error in server.log:

08.24 17:32:59: JobManager,NOTICE: job instance 'testadmin.r_testvm_resync_batch.15082' failed
08.24 17:46:25: JobManager,NOTICE: job instance 'testadmin.r_testvm_resync_batch.15684' failed
08.24 17:46:25: Broker,ERROR: Exception in thread "JDL Event (job_failed_event) jobId (testadmin.r_testvm_resync_batch.15684)" 
08.24 17:46:25: Broker,ERROR: ValueError: I/O operation on closed file
Possible Cause: This error indicates that the maximum number of JDL threads allowed by the server has been exceeded. Testing has shown that numerous instances of the provisioner_completed_event are blocked, waiting for the provisioner job to finish its job_started_event.
Action: Rewrite the original script. The original script might look like this:
import time
class testvm_resync(Job):
    def job_started_event(self):
        vms_group = getMatrix().getGroup(TYPE_RESOURCE, 'VMs') # gets the matrix object id for 'VMs' group
        vms = vms_group.getMembers() # gets the group members of 'VMs' group
        for vm in vms:
            id = vm.getFact("resource.id") #gets the resource.id fact of a vm
            thevmtype = vm.getFact("resource.type") # find the vm type
            if id.startswith("c-") and thevmtype == 'VM':  # search criteria
                vmstate = vm.getFact("resource.provision.state") # find the vm state
                thevm = getMatrix().getGridObject(TYPE_RESOURCE, id) # gets the vm's grid object by its id
                thevm.check() # vm life cycle operations
                time.sleep(2*60) #pause time - 2 min

The rewritten script might look like this:

import time
class testvm_resync(Job):
    def job_started_event(self):
        timer = Timer(self.prov,0)
    def prov(self):
        vms_group = getMatrix().getGroup(TYPE_RESOURCE, 'VMs') # gets the matrix object id for 'VMs' group
        vms = vms_group.getMembers() # gets the group members of 'VMs' group
        for vm in vms:
            id = vm.getFact("resource.id") #gets the resource.id fact of a vm
            thevmtype = vm.getFact("resource.type") # find the vm type
            if id.startswith("c-") and thevmtype == 'VM':  # search criteria
                vmstate = vm.getFact("resource.provision.state") # find the vm state
                thevm = getMatrix().getGridObject(TYPE_RESOURCE, id) # gets the vm's grid object by its id
                thevm.check() # vm life cycle operations
                time.sleep(2*60) #pause time - 2 min

This change lets the job_started_event end immediately after handing the work off to another JDL method (prov in this example), which runs on a timer. In this example, the timer delay is set to zero, but you could specify a longer delay, such as 10 seconds.

The timer is normally used for callbacks. For example, the vSphere provisioning adapter uses Timer to check every 30 seconds whether a vSphere action is still running or has died.

This issue is not isolated to the check() action. It also applies to other actions, such as provision(), shutdown(), suspend(), checkpoint(), saveConfig(), and restart().
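
The hand-off pattern can be tried outside the Orchestration Server with the Python standard library. The following is a minimal sketch that uses threading.Timer as a stand-in for the JDL Timer. It is not JDL code, the ResyncJob class and its attributes are hypothetical, and the argument order of threading.Timer (delay first, then callback) differs from the JDL call shown above.

```python
import threading


class ResyncJob:
    """Stand-in for a JDL job; only the hand-off pattern is illustrated."""

    def __init__(self, vm_ids):
        self.vm_ids = vm_ids
        self.checked = []
        self.done = threading.Event()

    def job_started_event(self):
        # Schedule the long-running work on a timer thread and return at once,
        # so the event handler does not stay occupied for the whole batch.
        timer = threading.Timer(0, self.prov)
        timer.start()

    def prov(self):
        # The long-running loop lives here instead of in the event handler.
        for vm_id in self.vm_ids:
            if vm_id.startswith("c-"):
                self.checked.append(vm_id)  # placeholder for thevm.check()
        self.done.set()


job = ResyncJob(["c-vm1", "x-vm2", "c-vm3"])
job.job_started_event()   # returns without waiting for prov()
job.done.wait(timeout=5)  # wait here only so the example can show the result
print(job.checked)
```

The point of the design is that job_started_event does no per-VM work itself, so it completes regardless of how long the batch takes.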

Moving a VM Host in vSphere Results in Duplicate Repositories

Source: The Orchestration Server.
Explanation: If you move a VM host in your vSphere environment and then perform a discovery in the Orchestration Console, the console displays duplicate repositories for the host that was moved.
Action: After you rediscover VM hosts and repositories in the Orchestration Console, you should delete the old repository grid object from the Explorer tree view in the Orchestration Console. Identify the repository to be deleted by checking the name of the datacenter, which is included in the repository.datacenter fact. If the value for this fact is the name of the old datacenter, this is the repository you want to delete.
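
The selection rule described in the Action can be sketched as follows. This is plain Python operating on sample data, not a call into the Orchestration Server. The function name, the repository IDs, and the datacenter names are hypothetical; only the repository.datacenter fact name comes from this section.

```python
def find_stale_repositories(repositories, old_datacenter):
    """Return the IDs of repositories whose datacenter fact names the old datacenter."""
    return [
        repo["id"]
        for repo in repositories
        if repo.get("repository.datacenter") == old_datacenter
    ]


# Hypothetical facts for two repository objects that appear after rediscovery.
repositories = [
    {"id": "host1_datastore_old", "repository.datacenter": "DC-Old"},
    {"id": "host1_datastore_new", "repository.datacenter": "DC-New"},
]

print(find_stale_repositories(repositories, "DC-Old"))
# prints: ['host1_datastore_old']
```

Only the repository that still reports the old datacenter is selected for deletion, and the rediscovered repository is left in place.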