VMware provides the ability to create virtual machines (VMs) that are provisioned with more memory than physically exists on their host servers. This is possible because VMware memory management is able to recover memory that is no longer in use by the VM’s guest operating system (OS).
However, you can push memory overcommitment to the point where the hypervisor is unable to keep up, leading to potentially severe performance degradation on the VMs.
With proper capacity planning you can estimate how much overcommitment is possible before risking performance problems.
Our Overcommitting VMware Resources Whitepaper delivers the guidelines you need to ensure that you are properly allocating your host resources without sacrificing performance.
Today’s blog will outline basic VMware memory terms, how and when VMware memory management initiates memory reclamation, and capacity planning best practices.
► VMware Memory Terms
- Capacity (Host level)
The physical memory available on the host. (See Fig. 1)
- Consumed Memory (Host level)
Total memory in use on the ESX host, which includes memory used by the Service Console, the VMkernel, and vSphere services, plus the memory in use by all running VMs.
Figure 1: vCenter with 37 GB consumed host memory out of 64 GB physical capacity
- Provisioned Memory (VM level)
The amount of memory allocated to a VM plus the overhead needed to manage the VM. Provisioned memory is an upper limit – when a VM is powered on, it will only consume the memory that the OS requests, and the hypervisor will continue to grant additional memory requests made by the VM until the provisioned memory limit is reached.
- Consumed Memory (VM level)
Current level of memory consumption for a specified VM. (See Fig. 2)
- Active Guest Memory (VM level)
The hypervisor's estimate of the memory actively being used by the VM's guest OS. The hypervisor does not communicate with the guest OS, so it cannot know whether memory allocated to the VM is still needed. To gauge memory activity, the hypervisor checks a random sample of the VM's allocated memory and calculates the percentage of the sample that is accessed during the sampling period.
Figure 2: vSphere display of a VM’s Consumed Host and Active Guest Memory
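The sampling idea behind Active Guest Memory can be illustrated with a small simulation. This is only a sketch of the statistics, not VMware's implementation: the function name, the 4 KB page size, the default sample size, and the list of "touched" flags are all assumptions made for the example.

```python
import random

def estimate_active_mb(touched, page_kb=4, sample_size=100, rng=None):
    """Estimate active memory (MB) from a random sample of pages.

    touched: one boolean per allocated page, True if the page was
    accessed during the sampling period (simulated input).
    sample_size: hypothetical number of pages checked per period.
    """
    rng = rng or random.Random()
    n = min(sample_size, len(touched))
    if n == 0:
        return 0.0
    # Pick n distinct pages at random and count how many were accessed.
    sample = rng.sample(range(len(touched)), n)
    active_fraction = sum(touched[i] for i in sample) / n
    # Scale the sampled fraction up to the full allocation.
    allocated_mb = len(touched) * page_kb / 1024
    return active_fraction * allocated_mb
```

Because the result is extrapolated from a sample, it is an estimate of recent access activity rather than a measure of what the guest OS has allocated.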
- mem.minFree (Host level)
A host-level minimum free memory threshold that is used to trigger memory reclamation. VMware initiates increasingly aggressive memory reclamation techniques as free memory decreases further below the mem.minFree value. The mem.minFree value is calculated on a sliding scale, in which 899 MB of memory is reserved for the first 28 GB of physical memory, and 1% of memory is reserved for physical memory beyond 28 GB. For example, for a 64 GB server:
mem.minFree = 899 MB + (64 GB - 28 GB) * 0.01 = 899 MB + 369 MB = 1268 MB
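The sliding-scale calculation can be written as a short function. This is a minimal sketch of the formula as described above; the function name is hypothetical, and it only covers hosts with at least 28 GB of physical memory because smaller hosts are not described here.

```python
def mem_min_free_mb(physical_gb):
    """Return mem.minFree in MB for a host with physical_gb of memory.

    Follows the sliding scale described in the text: 899 MB for the
    first 28 GB, plus 1% of any physical memory beyond 28 GB.
    """
    if physical_gb < 28:
        # The text does not describe the scale below 28 GB.
        raise ValueError("sketch only covers hosts with at least 28 GB")
    extra_mb = (physical_gb - 28) * 1024 * 0.01
    return 899 + extra_mb

# 64 GB host: 899 MB + 368.64 MB, which rounds to 1268 MB
print(round(mem_min_free_mb(64)))
```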
► VMware Memory Reclamation Techniques
The following are VMware memory reclamation techniques, in order of severity:
- Transparent Page Sharing (TPS)
VMware detects and de-duplicates identical memory pages. TPS begins by breaking up large memory pages (2 MB) into smaller pages (4 KB), and then checks the smaller pages for duplicates. TPS is "transparent" to the VM's guest operating system because it does not affect the amount of memory consumed by the VM. TPS is enabled within VMs; however, in VMware 6.0 it is disabled by default between VMs for security reasons.
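To make the de-duplication idea concrete, here is a toy sketch that splits a 2 MB page into 4 KB pages and keeps a single copy of each distinct page. It is an illustration only: the function names are hypothetical, SHA-256 is an arbitrary choice of hash for the lookup, and the byte-for-byte comparison is included as a safety check rather than as a statement of how the hypervisor does it.

```python
import hashlib

SMALL_PAGE = 4 * 1024  # 4 KB

def split_large_page(large_page):
    """Break a large page (bytes) into 4 KB small pages."""
    return [large_page[i:i + SMALL_PAGE]
            for i in range(0, len(large_page), SMALL_PAGE)]

def share_pages(pages):
    """Return (unique_pages, mapping); mapping[i] indexes unique_pages."""
    by_hash = {}   # digest -> indices of unique pages with that digest
    unique = []
    mapping = []
    for page in pages:
        digest = hashlib.sha256(page).digest()
        for idx in by_hash.get(digest, []):
            if unique[idx] == page:      # confirm a true duplicate
                mapping.append(idx)
                break
        else:
            # No identical page seen yet: store this one.
            unique.append(page)
            by_hash.setdefault(digest, []).append(len(unique) - 1)
            mapping.append(len(unique) - 1)
    return unique, mapping
```

For example, a zero-filled 2 MB page splits into 512 small pages that all collapse to one stored copy.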
- Ballooning
A technique in which the hypervisor reclaims idle memory from a guest OS and returns it to the host.
Ballooning works as follows:
- The hypervisor contacts a balloon driver installed on the guest OS as part of VMware Tools.
- The hypervisor tells the balloon driver to request memory for a balloon process from the guest OS.
- The guest OS allocates memory to the balloon process. That memory is now unavailable for other processes on the guest OS.
- The balloon driver contacts the hypervisor with the details of the memory it has been allocated.
- The hypervisor removes the ballooned memory from the VM, lowering memory consumed by that VM.
- If memory problems are resolved on the host, memory can be returned to the VMs by “deflating” the memory used by the balloon and re-allocating the memory to the VM.
From the guest OS perspective, the total memory has not been changed, but available memory has effectively decreased by the amount in use by the balloon process. Guest OS performance can be significantly degraded if the ballooning process reduces memory to the point where the guest OS needs to start paging.
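The ballooning steps above can be modeled with a few lines of bookkeeping. This is a simplified sketch, not the balloon driver itself: the class, its fields, and the rule that the guest starts paging once available memory goes negative are assumptions made for the illustration.

```python
from dataclasses import dataclass

@dataclass
class GuestVM:
    total_mb: int        # memory the guest OS believes it has
    in_use_mb: int       # memory used by guest processes
    balloon_mb: int = 0  # memory held by the balloon process

    @property
    def available_mb(self):
        # Total is unchanged; the balloon reduces what is left over.
        return self.total_mb - self.in_use_mb - self.balloon_mb

    @property
    def guest_is_paging(self):
        # Assumed rule: demand beyond total memory forces guest paging.
        return self.available_mb < 0

def inflate(vm, request_mb):
    """Balloon claims request_mb; return memory the host can reclaim."""
    vm.balloon_mb += request_mb
    return request_mb

def deflate(vm, release_mb):
    """Give ballooned memory back to the guest; return amount released."""
    released = min(release_mb, vm.balloon_mb)
    vm.balloon_mb -= released
    return released
```

With an 8192 MB guest using 6000 MB, inflating the balloon by 1000 MB leaves 1192 MB available, while inflating by a further 2000 MB pushes the guest into paging.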
- Memory compression
The hypervisor looks for memory pages that it can compress and reduce by at least 50%.
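The "at least 50%" rule can be sketched as a simple check. Note the assumptions: `zlib` stands in for whatever compression algorithm the hypervisor actually uses, and the function name is hypothetical.

```python
import zlib

def try_compress(page):
    """Return the compressed page if it shrinks by at least 50%, else None."""
    compressed = zlib.compress(page)
    if len(compressed) <= len(page) // 2:
        return compressed
    # Not compressible enough; such a page would have to be swapped instead.
    return None
```

A zero-filled 4 KB page passes this check easily, while a page of random bytes does not.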
- Swapping
The hypervisor swaps memory pages to disk.
|Threshold (% of mem.minFree)|State|Memory Reclamation Techniques|
|400%|High|Break down large memory pages|
|100%|Clear|Break down large memory pages + TPS|
|64%|Soft|TPS + Ballooning|
|32%|Hard|TPS + Memory Compression + Swapping|
|16%|Low|Memory Compression + Swapping + Block VMs from allocating memory|
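The table can be turned into a small lookup. This sketch assumes that a state applies once free memory falls to or below the listed percentage of mem.minFree, and that nothing is triggered above 400%; the function name and the handling of exact boundary values are assumptions, not documented behavior.

```python
# (threshold as % of mem.minFree, state, techniques), most severe first
RECLAMATION_STATES = [
    (16, "Low", ["Memory Compression", "Swapping",
                 "Block VMs from allocating memory"]),
    (32, "Hard", ["TPS", "Memory Compression", "Swapping"]),
    (64, "Soft", ["TPS", "Ballooning"]),
    (100, "Clear", ["Break down large memory pages", "TPS"]),
    (400, "High", ["Break down large memory pages"]),
]

def reclamation_state(free_mb, min_free_mb):
    """Return (state, techniques) for the given free memory."""
    pct = 100 * free_mb / min_free_mb
    for threshold, state, techniques in RECLAMATION_STATES:
        if pct <= threshold:
            return state, techniques
    # Above 400% of mem.minFree: no reclamation listed in the table.
    return None, []
```

For a host with a mem.minFree of 1268 MB, 800 MB of free memory is about 63% of the threshold, which this sketch reports as the Soft state.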
Key VMware Memory Reclamation Points
► Capacity Planning Best Practices for Memory
The goal of VMware is to maximize memory use without starving your VMs of the memory they need to perform. To estimate memory requirements for capacity planning you need to look at both the Active Guest Memory metrics from VMware and the Memory use metrics from the operating system.
Figures 3 and 4 show memory use on an Exchange server running within a VM. Figure 3 is memory use from the Windows perspective (~7.5 GB), and Figure 4 is active memory use from the VM perspective (~1.5 GB). In this case, VMware Active Guest Memory underestimates the memory needed for Exchange. When Active Guest Memory underestimates a VM's needs, the best practice is to allocate memory for the VM as specified by the application's requirements, and to make sure that the VM does not lose memory to ballooning, either by running the VM on a host without memory overcommitment or by setting a memory reservation for the VM.
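One way to spot VMs like this Exchange server is to compare the two metrics directly. The check below is only a sketch: the function name and the `gap_ratio` cutoff of 2.0 are hypothetical, and a suitable cutoff would depend on your own workloads.

```python
def active_memory_underestimates(os_used_gb, active_guest_gb, gap_ratio=2.0):
    """Flag a VM whose guest OS reports far more memory in use than
    the hypervisor's Active Guest Memory estimate suggests.

    gap_ratio is a hypothetical cutoff chosen for illustration.
    """
    return os_used_gb > gap_ratio * active_guest_gb

# Exchange example from the text: ~7.5 GB in Windows vs ~1.5 GB active
print(active_memory_underestimates(7.5, 1.5))
```

A flagged VM is a candidate for sizing from the application's requirements and for protection from ballooning, as described above.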
Best Practices for Capacity Planning for Memory
Overcommitting memory can make the best use of your resources, but keep an eye on the host’s consumed memory and the effect of memory reclamation on your VMs. If you have VMs running memory sensitive applications, make sure you allocate enough memory for them and protect them from memory reclamation.