Maximizing VM Performance and CPU Utilization

October 08, 2014 | Heroix Staff

In another post we discussed memory management and memory allocation in VMware. Memory over allocation means provisioning your virtual machines with more memory than physically exists on the host. It works because the hypervisor assigns memory to virtual machines as needed rather than as provisioned. Suppose you have a server that needs 2 GB of memory for 10 minutes each night and runs on 0.5 GB for the rest of the day. The hypervisor will run the VM with 0.5 GB of memory, increase it to 2 GB for those 10 minutes, and then reclaim the memory once it has gone unused for a while and is needed elsewhere.

The safest approach is to plan for the case where all VMs use their maximum memory allocation at once, and to assign only memory that physically exists. However, this leaves a lot of idle memory on the table that could be used for additional VMs. If you use that idle memory to provision additional VMs, the (unlikely) worst case is that all the VMs spike to 100% of their memory at the same time, forcing the hypervisor to start swapping and causing severe performance degradation. The additional VMs you could create through memory over allocation aren't worth that risk on a host running a mission critical VM. However, if you need to squeeze in a couple more web servers or virtual desktops, memory over allocation is useful.
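The trade-off can be put into numbers. The following is a minimal Python sketch, not a VMware tool: the VM class, the memory_overcommit function, and the sample host with 16 GB and twelve web servers are all hypothetical, chosen only to mirror the 2 GB / 0.5 GB example above.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    provisioned_gb: float  # memory assigned to the VM
    typical_gb: float      # memory the VM normally uses (assumed known)


def memory_overcommit(host_gb, vms):
    """Summarize how far a host's memory is over allocated."""
    provisioned = sum(vm.provisioned_gb for vm in vms)
    typical = sum(vm.typical_gb for vm in vms)
    return {
        # Above 1.0 means more memory is provisioned than exists.
        "ratio": provisioned / host_gb,
        # Memory left over when every VM sits at its typical usage.
        "typical_headroom_gb": host_gb - typical,
        # Memory the hypervisor would have to swap if every VM
        # spiked to its full allocation at the same time.
        "worst_case_shortfall_gb": max(0.0, provisioned - host_gb),
    }


# Hypothetical host: 16 GB of RAM, twelve web servers at 2 GB each.
vms = [VM(f"web{i}", provisioned_gb=2.0, typical_gb=0.5) for i in range(12)]
report = memory_overcommit(16.0, vms)
print(report)
```

In this made-up case the host is over allocated by a factor of 1.5 and has plenty of headroom on a typical day, but would be 8 GB short if every VM peaked at once.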

Just as with VM memory, CPU is usually highly underutilized and can be over allocated without compromising performance. According to the Performance Best Practices for VMware vSphere 5.5:

In most environments ESXi allows significant levels of CPU overcommitment (that is, running more vCPUs on a host than the total number of physical processor cores in that host) without impacting virtual machine performance.
(p. 20)


Without over allocation the total number of vCPUs is limited to the number of physical CPU cores (pCPU) on a host:

(# Processor Sockets) X (# Cores / Processor)  = # Physical Processors (pCPU)

If the physical processors use hyperthreading:

(# pCPU) X (2 logical processors / physical processor) = # Logical Processors

If you’ve got 2 processors with 6 cores each, that provides 12 pCPUs, or 24 logical processors with hyperthreading enabled. However, hyperthreading works by providing a second execution thread to an existing core: when one thread is idle or waiting, the other thread can execute instructions. This can increase efficiency if there is enough CPU idle time to schedule two threads. In practice, though, the performance increase is at most about 30%, not the 2x suggested by the logical processor formula.
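The arithmetic above can be sketched in a few lines of Python. The function names are hypothetical, and the ht_gain default of 0.30 simply encodes the "up to 30%" figure from the text as a best case, not a measured value.

```python
def physical_cores(sockets, cores_per_socket):
    """(# processor sockets) x (# cores per processor) = # pCPU."""
    return sockets * cores_per_socket


def logical_processors(pcpu, hyperthreading):
    """Each core exposes two logical processors with hyperthreading."""
    return pcpu * 2 if hyperthreading else pcpu


def effective_capacity(pcpu, hyperthreading, ht_gain=0.30):
    """Best-case capacity in core equivalents.

    ht_gain is an assumption: the upper bound quoted in the text.
    """
    return pcpu * (1 + ht_gain) if hyperthreading else float(pcpu)


pcpu = physical_cores(sockets=2, cores_per_socket=6)
logical = logical_processors(pcpu, hyperthreading=True)
capacity = effective_capacity(pcpu, hyperthreading=True)

print(pcpu, logical, round(capacity, 1))
```

For the 2 x 6 example, the host reports 24 logical processors but offers at best roughly 15.6 cores' worth of work, which is why sizing against the logical processor count alone can be misleading.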

In addition to the effect of hyperthreading, you will also need to consider the type of workloads being run and whether you are using NUMA (Non-Uniform Memory Access) hardware. We’ll delve into the intricacies of tuning vCPUs, workloads, and host hardware in a later post. For now, Best Practices for Oversubscription of CPU, Memory and Storage suggests starting with one vCPU per VM and increasing as needed, and quotes recommendations for the maximum ratio of vCPUs to pCPUs that vary from 1.5 to 15.
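To see what those ratios mean for a given host, here is a small Python sketch. The helper names and the sample VM counts are hypothetical; the 1.5 and 15 limits are the two ends of the range quoted above.

```python
import math


def vcpu_to_pcpu_ratio(vcpus_per_vm, pcpu):
    """Ratio of all provisioned vCPUs to physical cores on the host."""
    return sum(vcpus_per_vm) / pcpu


def max_vcpus(pcpu, target_ratio):
    """Largest total vCPU count that stays within a target ratio."""
    return math.floor(pcpu * target_ratio)


# Hypothetical host with 12 pCPUs running 30 VMs at one vCPU each.
ratio = vcpu_to_pcpu_ratio([1] * 30, pcpu=12)
conservative = max_vcpus(12, 1.5)
aggressive = max_vcpus(12, 15)

print(ratio, conservative, aggressive)
```

On 12 cores the quoted range spans 18 to 180 vCPUs, which shows why the right ratio has to come from monitoring your own workloads rather than from a single rule of thumb.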

The Best Practices paper lists several metrics to monitor in order to determine the best vCPU to pCPU ratio for your environment:

VM CPU Utilization: To determine if a VM requires additional vCPU resources.
Host CPU Utilization: To determine overall pCPU utilization.
CPU Ready: Measures the amount of time a VM has to wait for pCPU resources. VMware recommends this should be less than 5%.

The maximum CPU utilization threshold for both host and VM is typically set at 80%, but this value should be adjusted depending on your workload and hardware.
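A simple check of the three metrics against these thresholds might look like the Python sketch below. The function names and messages are hypothetical, and the inputs are assumed to come from whatever monitoring tool you use. The conversion of a CPU Ready summation in milliseconds to a percentage assumes a 20-second real-time sampling interval; confirm the interval for your own charts before relying on it.

```python
def cpu_ready_percent(ready_ms, interval_s=20):
    """Convert a CPU Ready summation (ms) to a percentage of the interval.

    interval_s=20 is an assumption (real-time chart sampling interval).
    """
    return ready_ms * 100 / (interval_s * 1000)


def review_cpu(vm_cpu_pct, host_cpu_pct, ready_pct,
               max_cpu_pct=80.0, max_ready_pct=5.0):
    """Return a list of findings; an empty list means all checks pass."""
    findings = []
    if vm_cpu_pct > max_cpu_pct:
        findings.append("VM CPU high: the VM may need another vCPU")
    if host_cpu_pct > max_cpu_pct:
        findings.append("Host CPU high: the host may be over allocated")
    if ready_pct >= max_ready_pct:
        findings.append("CPU Ready high: VMs are waiting for pCPU time")
    return findings


# Hypothetical sample: 500 ms of ready time in a 20-second interval.
ready = cpu_ready_percent(500)
print(ready, review_cpu(vm_cpu_pct=65.0, host_cpu_pct=85.0, ready_pct=ready))
```

The 80% and 5% defaults are parameters rather than constants so that they can be tuned per workload, as the text recommends.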

Want to learn more?

Download our Virtualization or Cloud IaaS Whitepaper - both technologies can provide redundancies that will maximize your uptime and that will allow you to squeeze out the most performance. Which is better and how do you decide?

