VMware provides the ability to create virtual machines (VMs) that are provisioned with more memory than physically exists on their host servers. This is possible because VMware memory management is able to recover memory that is no longer in use by the VM’s guest operating system (OS). However, you can push memory overcommitment to the point where the hypervisor is unable to keep up, leading to potentially severe performance degradation on the VMs. With proper capacity planning you can estimate how much overcommitment is possible before risking performance problems.
Today’s blog will outline basic VMware memory terms, how and when VMware memory management initiates memory reclamation, and capacity planning best practices.
VMware memory terms
The following concepts are used in discussing memory on ESX hosts:
- Capacity (Host level)
The physical memory available on the host. (See Fig. 1)
- Consumed Memory (Host level)
Total memory in use on the ESX host, which includes memory used by the Service Console, the VMkernel, and vSphere services, plus the memory in use by all running VMs.
- Provisioned Memory (VM level)
The amount of memory allocated to a VM plus the overhead needed to manage the VM. Provisioned memory is an upper limit – when a VM is powered on, it will only consume the memory that the OS requests, and the hypervisor will continue to grant additional memory requests made by the VM until the provisioned memory limit is reached.
- Consumed Memory (VM level)
Current level of memory consumption for a specified VM. (See Fig. 2)
- Active Guest Memory (VM level)
Hypervisor estimate of memory actively being used in the VM’s guest OS. The hypervisor does not communicate with the guest OS, so it does not know if any memory allocated to the VM is no longer needed. To gauge memory activity the hypervisor checks a random sample of the VM’s allocated memory and calculates the percent of the sample that is actively being accessed during the sampling period.
Figure 2: vSphere display of a VM’s Consumed Host and Active Guest Memory
- mem.minFree (Host level)
A host-level minimum free memory threshold that is used to trigger memory reclamation. VMware initiates increasingly aggressive memory reclamation techniques as free memory decreases further below the mem.minFree value. The mem.minFree value is calculated on a sliding scale, in which 899 MB of memory is reserved for the first 28 GB of physical memory, and 1% of memory is reserved for physical memory beyond 28 GB. For example, for a 64 GB server:
mem.minFree = 899 MB + (64 GB - 28 GB) × 0.01 = 899 MB + 369 MB = 1268 MB
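The sliding-scale calculation above can be written as a short function. This is a minimal sketch, not VMware code: the function name `mem_min_free_mb` is hypothetical, and it assumes a host with at least 28 GB of physical memory, because the text does not spell out the scale for smaller hosts.

```python
MB_PER_GB = 1024


def mem_min_free_mb(host_memory_gb: float) -> float:
    """Estimate mem.minFree in MB using the sliding scale described in the text.

    Assumption: the host has 28 GB or more; smaller hosts are rejected
    because the article does not describe their scale.
    """
    base_gb = 28           # first 28 GB of physical memory
    base_reserve_mb = 899  # reserved for that first 28 GB
    if host_memory_gb < base_gb:
        raise ValueError("this sketch only covers hosts with 28 GB or more")
    beyond_mb = (host_memory_gb - base_gb) * MB_PER_GB
    return base_reserve_mb + 0.01 * beyond_mb  # 1% of memory beyond 28 GB


# The 64 GB server from the example rounds to 1268 MB.
print(round(mem_min_free_mb(64)))
```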
VMware memory management: reclamation
The following are VMware memory reclamation techniques, in order of severity:
- Transparent Page Sharing (TPS) - VMware detects and de-duplicates identical memory pages. TPS begins by breaking up large memory pages (2 MB) into smaller pages (4 KB), and then checks the smaller pages for duplicates. TPS is “transparent” to the VM’s guest operating system, as it does not affect the amount of memory consumed by the VM. TPS is enabled within a VM; however, in VMware 6.0 it is disabled by default between VMs for security reasons.
- Ballooning is a technique in which the hypervisor reclaims idle memory from a guest OS and returns it to the host. Ballooning works as follows:
- The hypervisor contacts a balloon driver installed on the guest OS as part of VMware Tools.
- The hypervisor tells the balloon driver to request memory for a balloon process from the guest OS.
- The guest OS allocates memory to the balloon process. That memory is now unavailable for other processes on the guest OS.
- The balloon driver contacts the hypervisor with the details of the memory it has been allocated.
- The hypervisor removes the ballooned memory from the VM, lowering memory consumed by that VM.
- If memory problems are resolved on the host, memory can be returned to the VMs by “deflating” the memory used by the balloon and re-allocating the memory to the VM.
From the guest OS perspective, the total memory has not been changed, but available memory has effectively decreased by the amount in use by the balloon process. Guest OS performance can be significantly degraded if the ballooning process reduces memory to the point where the guest OS needs to start paging.
- Memory compression - the hypervisor looks for memory pages that it can compress to at most half of their original size (a reduction of at least 50%).
- Swapping - the hypervisor swaps memory pages to disk.
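To make the ballooning steps above concrete, here is a toy model of the guest OS's view of memory while a balloon inflates and deflates. It is only an illustrative sketch: the `GuestMemory` class and its fields are hypothetical, all sizes are in MB, and the model ignores everything a real balloon driver and hypervisor do beyond the bookkeeping.

```python
from dataclasses import dataclass


@dataclass
class GuestMemory:
    """Hypothetical model of a guest OS's memory accounting, in MB."""
    total_mb: int        # memory the guest OS believes it has (never changes)
    in_use_mb: int       # memory used by ordinary guest processes
    balloon_mb: int = 0  # memory held by the balloon process

    @property
    def available_mb(self) -> int:
        """Memory left for other guest processes."""
        return max(0, self.total_mb - self.in_use_mb - self.balloon_mb)

    @property
    def paged_mb(self) -> int:
        """Guest demand that no longer fits in memory and must be paged."""
        return max(0, self.in_use_mb + self.balloon_mb - self.total_mb)

    def inflate(self, request_mb: int) -> int:
        """Balloon requests memory from the guest; returns MB the hypervisor reclaims."""
        granted = max(0, min(request_mb, self.total_mb - self.balloon_mb))
        self.balloon_mb += granted
        return granted

    def deflate(self, release_mb: int) -> int:
        """Balloon shrinks; returns MB given back to the guest."""
        released = max(0, min(release_mb, self.balloon_mb))
        self.balloon_mb -= released
        return released


guest = GuestMemory(total_mb=8192, in_use_mb=5000)
guest.inflate(2048)
print(guest.total_mb, guest.available_mb, guest.paged_mb)
guest.inflate(2048)  # balloon now squeezes the guest into paging
print(guest.total_mb, guest.available_mb, guest.paged_mb)
```

Total memory stays constant from the guest's point of view, while available memory shrinks and, after the second inflation, the guest has to page.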
|Free memory threshold (% of mem.minFree)||State||Memory Reclamation Techniques|
|400%||High||Break down large memory pages|
|100%||Clear||Break down large memory pages + TPS|
|64%||Soft||TPS + Ballooning|
|32%||Hard||TPS + Memory Compression + Swapping|
|16%||Low||Memory Compression + Swapping + Block VMs from allocating memory|
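The table can be turned into a small lookup that reports which state a host is in for a given amount of free memory. This is a sketch under one explicit assumption: a row applies once free memory has dropped below that row's percentage of mem.minFree, and no reclamation happens at 400% or above. The names `RECLAMATION_TABLE` and `memory_state` are hypothetical.

```python
from typing import Optional, Tuple

# (percent of mem.minFree, state, reclamation techniques), copied from the table.
# Ordered from most to least severe so the first match wins.
RECLAMATION_TABLE = [
    (16, "Low", "Memory Compression + Swapping + Block VMs from allocating memory"),
    (32, "Hard", "TPS + Memory Compression + Swapping"),
    (64, "Soft", "TPS + Ballooning"),
    (100, "Clear", "Break down large memory pages + TPS"),
    (400, "High", "Break down large memory pages"),
]


def memory_state(free_mb: float, min_free_mb: float) -> Optional[Tuple[str, str]]:
    """Return (state, techniques) for the host, or None if nothing is triggered.

    Assumption: a state is entered when free memory falls below its threshold.
    """
    percent_of_min_free = 100 * free_mb / min_free_mb
    for threshold, state, techniques in RECLAMATION_TABLE:
        if percent_of_min_free < threshold:
            return state, techniques
    return None


# 64 GB host from the earlier example (mem.minFree = 1268 MB) with 500 MB free.
print(memory_state(500, 1268))
```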
- TPS does not affect VM performance.
- Monitor a VM’s guest OS for paging due to low memory during ballooning.
- Memory compression and swapping can cause serious performance problems on VMs.
- TPS and ballooning are relatively slow compared to swapping; if the host needs memory quickly, swapping may be used.
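To illustrate why TPS does not change what a VM sees, here is a minimal sketch of page de-duplication over 4 KB pages. It is not VMware's implementation: the function `share_pages` is hypothetical, it uses SHA-256 purely as a convenient content key, and it leaves out the copy-on-write handling a real system needs when a shared page is modified.

```python
import hashlib

PAGE_SIZE = 4096  # 4 KB small pages, as in the TPS description


def share_pages(pages):
    """De-duplicate pages by content.

    Returns (mapping, store): mapping[i] is the key of the stored copy that
    backs page i, and store holds one copy of each distinct page.
    """
    store = {}
    mapping = []
    for page in pages:
        key = hashlib.sha256(page).digest()
        existing = store.get(key)
        if existing is None:
            store[key] = page  # first copy of this content
        elif existing != page:
            # A hash match is only a hint; never share pages whose bytes differ.
            raise RuntimeError("hash collision between different pages")
        mapping.append(key)
    return mapping, store


zero_page = bytes(PAGE_SIZE)
data_page = b"\x01" * PAGE_SIZE
pages = [zero_page, zero_page, data_page, zero_page]
mapping, store = share_pages(pages)

# Every page still reads back with its original content.
print(all(store[key] == page for key, page in zip(mapping, pages)))
# Bytes saved by keeping one copy of each distinct page.
print((len(pages) - len(store)) * PAGE_SIZE)
```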
Capacity Planning Best Practices for Memory
The goal of VMware memory management is to maximize memory use without starving your VMs of the memory they need to perform. To estimate memory requirements for capacity planning, you need to look at both the Active Guest Memory metrics from VMware and the memory use metrics from the guest operating system.
Figures 3 and 4 show memory use on an Exchange 2013 server running within a VM: Figure 3 is memory use from the Windows perspective (~7.5 GB), and Figure 4 is active memory use from the VM perspective (~1.5 GB). Here, VMware Active Guest Memory underestimates the memory needed for Exchange. The best practice for underestimated Active Guest Memory is to allocate memory for the VM as specified by the application’s requirements, and to make sure that the VM does not lose memory to ballooning, either by running the VM on a host without memory overcommitment or by setting a memory reservation for the VM.
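The comparison in the Exchange example can be expressed as a simple check. This is only a sketch: the function name is hypothetical, and the 0.5 cutoff is an assumed stand-in for "underestimates", which the text does not quantify.

```python
def needs_reclamation_protection(
    active_guest_gb: float,
    guest_os_used_gb: float,
    ratio_threshold: float = 0.5,  # assumed cutoff, not a VMware value
) -> bool:
    """True when Active Guest Memory is far below what the guest OS reports."""
    if guest_os_used_gb <= 0:
        raise ValueError("guest OS memory use must be positive")
    return active_guest_gb / guest_os_used_gb < ratio_threshold


# Exchange example: ~1.5 GB active vs ~7.5 GB used in Windows (ratio 0.2).
print(needs_reclamation_protection(1.5, 7.5))
```

A VM flagged this way is a candidate for a memory reservation or for a host that is not overcommitted.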
Best practices for memory capacity planning are:
- Check memory use reports for the VM’s guest OS to gauge memory requirements, and defer to application recommendations for memory allocation.
- Do not use more memory than is needed for your VM. Ideally, consumed memory for the VM should be close to the memory used by the guest OS, plus overhead for running the VM.
- If the Active Guest Memory estimate is significantly smaller than actual guest OS Memory use, protect the VM from memory reclamation by either setting a memory reservation or running the VM on a host that has not been overcommitted.
- If you overcommit memory, monitor Host Consumed memory. When Host Consumed is close to capacity, watch for signs of the Host Consumed value dropping (indicating TPS recovering memory), or unexpectedly low free memory on the guest OSes (indicating ballooning).
- While ballooning is occurring, monitor free memory and paging on the guest OSes. Move VMs to hosts with more memory if ballooning causes performance degradation on the VMs.
- If ballooning does not resolve low memory on the host, move or power down VMs before reaching the Hard memory state (32% of mem.minFree). Memory compression and swapping cause severe performance degradation.
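For the host-level checks in the list above, a rough sketch might compute how far a host is overcommitted and flag when Host Consumed approaches capacity. The `VM` class, both function names, and the 90% "close to capacity" cutoff are assumptions made for illustration; the inputs would come from whatever monitoring data you already collect.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VM:
    """Hypothetical record of a VM's provisioned memory."""
    name: str
    provisioned_gb: float


def overcommit_ratio(capacity_gb: float, vms: List[VM]) -> float:
    """Total provisioned memory divided by host capacity (above 1.0 = overcommitted)."""
    return sum(vm.provisioned_gb for vm in vms) / capacity_gb


def host_consumed_near_capacity(
    consumed_gb: float,
    capacity_gb: float,
    near_capacity: float = 0.9,  # assumed definition of "close to capacity"
) -> bool:
    """True when Host Consumed memory has reached the assumed warning level."""
    return consumed_gb >= near_capacity * capacity_gb


vms = [VM(f"vm{i}", 24) for i in range(4)]  # four 24 GB VMs on a 64 GB host
print(overcommit_ratio(64, vms))
print(host_consumed_near_capacity(60, 64))
```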
Overcommitting memory can make the best use of your resources, but keep an eye on the host’s consumed memory and the effect of memory reclamation on your VMs. If you have VMs running memory sensitive applications, make sure you allocate enough memory for them and protect them from memory reclamation.