Blog

VMware Distributed Resource Scheduler

June 22, 2017 | Susan Bilder

As we’ve discussed in previous posts, VMware can ensure uptime in the event of a hardware failure via its High Availability (HA) feature. HA detects when a VMware ESX host fails, and uses vMotion to move the virtual machines (VMs) on the failed host to an available host in the cluster. However, HA does not monitor VM performance and will not move VMs to a different host if they would perform better on that host. That’s where VMware’s Distributed Resource Scheduler (DRS) comes in.

DRS monitors the memory and CPU demand for VMs and calculates the optimal distribution to load balance VMs on available host servers, both for their initial placement on a host and later by recommendations to vMotion a VM to a different host. DRS also ensures that configuration settings for resource pools and VMs are enforced as part of its load balancing algorithm.

This post will provide an overview of the settings that DRS enforces, and an outline of how DRS estimates VM and host resource demand for load balancing. The outline will provide the basic methodology used by DRS, but the complete details of the complex algorithm that DRS uses for load balancing are beyond the scope of today’s post.

Resource Allocation Settings

When you create a VM, you specify the maximum CPU and memory the VM can access. However, on startup the VM is granted only the resources it requires rather than the full amount it has been allocated. Setting a resource Reservation, for either CPU or memory, causes the host to set aside the specified CPU and/or memory when the VM starts, with the reserved amount available for the exclusive use of the VM.

While a reservation may sound like a great way to make sure that your high-priority VMs don’t need to contend for resources, the downside is that the VM can only be moved to a host with enough unreserved CPU and memory to satisfy the VM’s reservation. This can interfere with the ability of HA or DRS to vMotion the VM to a new host.

Where a reservation is a guaranteed minimum of resources, a Limit is the maximum. A limit is useful if you want to restrict the resources available to VMs: regardless of the number of vCPUs or the amount of memory allocated to a VM, the VM will only be able to access CPU up to the CPU limit, and memory up to the memory limit. That being said, setting limits below the CPU or memory actually allocated to a VM can cause significant performance issues for the VM’s operating system.
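
To make the interaction concrete, here is a minimal Python sketch of how a reservation and limit bound the resources a VM actually receives. The function name and units are mine, for illustration only:

    # Clamp a VM's resource demand between its reservation (guaranteed
    # minimum) and its limit (hard maximum).
    def effective_allocation(demand_mb, reservation_mb=0, limit_mb=float("inf")):
        return min(max(demand_mb, reservation_mb), limit_mb)

    # A VM demanding 6 GB with a 2 GB reservation and a 4 GB limit
    # receives only 4 GB, no matter how much memory it was allocated.
    print(effective_allocation(6144, reservation_mb=2048, limit_mb=4096))  # 4096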

The screenshot below shows the vSphere web client settings page for CPU resource allocation for VM DC01, with the default CPU reservation of 0 and a CPU limit of “unlimited”.

Fig. 1: vSphere web client edit screen for VM DC01 resource allocation.

Resource Pool Architecture

A VMware resource pool groups host resources together and makes them available for management as an aggregate across hosts rather than as per-host resources. By default, resources are grouped into a single root resource pool, which can be subdivided into smaller pools; VMs assigned to these smaller pools are limited to the pool’s resources.

For example, if you have 4 ESX servers, each of which has 64 GB RAM and 16 vCPUs, and you want to provide more resources to production than to development, you could subdivide your root resource pool (256 GB RAM, 64 vCPUs) into a production pool and a development pool:


Fig. 2: Resource Pool Architecture

The child resource pools do not have specific amounts of memory or CPU allocated to them. The child pools receive resources from their parent pool based on requests from the VMs in the child pool. As long as the parent pool has resources available, the requests from the VMs in the child pools will be granted.

However, if the parent pool runs out of resources, it needs a way to prioritize resource distribution between the child pools. In order to do this, the child resource pools are assigned shares, with higher priority pools receiving more shares than lower priority ones. Resources are then allocated to the child pools based on their proportion of shares to the total number of shares.

Shares

Shares are used to configure the relative importance of both VMs and resource pools so that VMware can preferentially direct resources to higher priority resource pools or VMs when there is no free memory or CPU. The parent pool distributes resources among its child pools based on the number of CPU or memory shares the child pool has relative to the parent pool’s other child pools.

Shares are assigned to resource pools and VMs by setting a share level (see Fig. 1). Share levels can be set to Custom, Low, Normal (default), or High. High has twice the shares that Normal has, and Normal has twice the shares Low has. When setting a custom value, check that it is similar to the shares assigned for CPU or memory for other child pools at the same level, or for other VMs in the same child pool.

To calculate shares for VMs and resource pools, VMware’s Resource Management Guide provides the following values:

Share level   CPU                Memory
High          2000 shares/vCPU   20 shares/MB
Normal        1000 shares/vCPU   10 shares/MB
Low           500 shares/vCPU    5 shares/MB
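
This guideline translates directly into arithmetic. The following Python sketch (a hypothetical helper, not a VMware API) computes shares from a share level:

    # Shares per vCPU and per MB of configured memory, following the
    # Resource Management Guide values in the table above.
    SHARE_VALUES = {
        "high":   {"per_vcpu": 2000, "per_mb": 20},
        "normal": {"per_vcpu": 1000, "per_mb": 10},
        "low":    {"per_vcpu": 500,  "per_mb": 5},
    }

    def shares_for(level, vcpus, memory_mb):
        """Return (cpu_shares, memory_shares) for a VM or resource pool."""
        values = SHARE_VALUES[level]
        return vcpus * values["per_vcpu"], memory_mb * values["per_mb"]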

Resource pools have shares calculated for them as if they were VMs with 4 vCPUs and 16 GB RAM. The number of shares per child pool based on share level would be:

Share level   vCPU shares   Memory shares
High          8000          327,680
Normal        4000          163,840
Low           2000          81,920
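
Using the sketch above, treating a pool as a 4 vCPU / 16 GB (16,384 MB) VM reproduces these numbers:

    print(shares_for("high", 4, 16384))    # (8000, 327680)
    print(shares_for("normal", 4, 16384))  # (4000, 163840)
    print(shares_for("low", 4, 16384))     # (2000, 81920)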

The total number of shares is not significant; what matters is the proportion of shares each child pool has relative to the total. In the example above, if both Development and Production had Normal share levels:

Pool          vCPU shares     vCPU allocated   Memory shares      Memory allocated
Development   4000 (Normal)   50%              163,840 (Normal)   50%
Production    4000 (Normal)   50%              163,840 (Normal)   50%

In this case Development and Production each have half of the total vCPU shares and half of the total Memory shares. If the root resource pool runs out of memory or CPU, resources will be distributed evenly between the resource pools.

If the Development pool were set to Low shares, both the total number of shares and Development’s portion of them would drop, leaving Development with 1/3 of the total resources:

Pool          vCPU shares     vCPU allocated   Memory shares      Memory allocated
Development   2000 (Low)      33%              81,920 (Low)       33%
Production    4000 (Normal)   67%              163,840 (Normal)   67%

Resources would be cut even further if Development were set to Low and Production to High for vCPU, bringing Development down to 20% of vCPU resources. However, setting Development’s memory share level to Normal doubles its memory shares, and brings the memory it can access back to 1/3 of the root resource pool:

Pool          vCPU shares     vCPU allocated   Memory shares      Memory allocated
Development   2000 (Low)      20%              163,840 (Normal)   33%
Production    8000 (High)     80%              327,680 (High)     67%
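
The percentages in these tables are simply each pool’s shares divided by the total. A small Python sketch (again illustrative, not a VMware API) makes the calculation explicit:

    # Divide a contended resource among pools in proportion to shares.
    def allocate_by_shares(shares):
        """Map name -> shares to name -> fraction of the resource."""
        total = sum(shares.values())
        return {name: s / total for name, s in shares.items()}

    # Development at Low and Production at High for vCPU (table above):
    print(allocate_by_shares({"Development": 2000, "Production": 8000}))
    # {'Development': 0.2, 'Production': 0.8}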

The allocation of resources for VMs within a resource pool works in the same way as it does between pools. For example, for the following VMs running in the same resource pool:

VM    vCPUs   vCPU shares     vCPU allocated   Memory   Memory shares     Memory allocated
VM1   2       2000 (Normal)   20%              2 GB     10,240 (Low)      10%
VM2   2       2000 (Normal)   20%              2 GB     20,480 (Normal)   18%
VM3   4       4000 (Normal)   40%              4 GB     40,960 (Normal)   36%
VM4   2       2000 (Normal)   20%              2 GB     40,960 (High)     36%

When VMware manages the resource pool, VM3 and VM4 would have the highest priority for memory, VM2 would be next, and VM1 would have the lowest priority. For CPU, all the VMs have Normal priority, but VM3 would receive more resources because it has twice as many vCPUs and therefore twice as many shares.
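
The same proportional calculation reproduces the memory column of the VM table above (the table rounds the fractions):

    # Memory shares from the VM table; the fractions come out to
    # roughly 10%, 18%, 36%, and 36%.
    vm_memory_shares = {"VM1": 10240, "VM2": 20480, "VM3": 40960, "VM4": 40960}
    print(allocate_by_shares(vm_memory_shares))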

DRS Settings

  • Automation Level

    The automation level determines what DRS does when it has a vMotion load balancing recommendation, or a recommendation for a host for a VM’s initial placement. The settings are:
    Manual               DRS suggests initial placement and load balancing, but does not apply recommendations.
    Partially Automated  DRS applies initial placement recommendations, but only recommends load balancing moves.
    Fully Automated      DRS applies both initial placement and load balancing recommendations.
  • Affinity

    Sometimes you want two or more VMs to run on the same host (e.g. a database backend and web server), and sometimes you want them to run on different hosts (e.g. domain controllers). Creating an affinity rule links VMs together on the same host while an anti-affinity rule requires that the VMs be on different hosts. You can also create rules that tie VMs to specific hosts.
  • Aggression Level

    The DRS aggression level sets a load imbalance threshold for when DRS will suggest vMotioning a VM. Levels run from 1 to 5, where level 1 does not load balance VMs at all but still enforces existing configuration rules (affinity, reservations, etc.), and level 5 is the most aggressive about using vMotion to load balance. The default setting is level 3.

How Does DRS Work?

DRS suggests initial placement of VMs on hosts, and moves of existing VMs in order to load balance clusters of hosts. For initial host placement, the primary factors are the constraints on the VM (a simple feasibility check sketched in code follows the list):

  • The host must have enough available resources for any resource reservations.
  • If the VM is configured to run on a specific host, it should be placed on that host.
  • The VM should be on the same host as any VMs with which it has an affinity rule.
  • The VM should not be on the same host as VMs with which it has an anti-affinity rule.
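
As a rough illustration (not VMware’s actual code, and all names here are hypothetical), these constraints can be expressed as a placement feasibility check:

    # Return True if a VM may be placed on a host under the
    # constraints listed above.
    def can_place(vm, host):
        # Enough unreserved capacity for the VM's reservations.
        if vm["cpu_reservation"] > host["free_cpu"]:
            return False
        if vm["mem_reservation"] > host["free_mem"]:
            return False
        # Host affinity: the VM must run on its designated host.
        if vm.get("required_host") and vm["required_host"] != host["name"]:
            return False
        # VM affinity: linked VMs should share this host.
        if any(peer not in host["vms"] for peer in vm.get("affinity", [])):
            return False
        # Anti-affinity: these VMs must not share this host.
        if any(peer in host["vms"] for peer in vm.get("anti_affinity", [])):
            return False
        return True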

For load balancing, DRS monitors the CPU and memory demand of the hosts and VMs in the DRS cluster every 5 minutes and does the following (a simplified code sketch follows the list):

  1. DRS calculates the current CPU and memory demands for each VM:

    A VM’s CPU demand is the sum of its CPU used and the portion of its CPU ready time proportional to how often the VM runs:
    CPU_demand = CPU_used + CPU_ready * CPU_run / (CPU_run + CPU_sleep)

    For memory, demand is the VM’s active memory plus 25% of its idle consumed memory:
    Memory_demand = Memory_active + 0.25 * Memory_idle_consumed

    If the VM has a reservation, this is used as a lower bound for the demand, and if it has a limit, this is used as an upper bound.
  2. DRS calculates the resources to which each VM is entitled (Ei). This calculation looks at the VM’s reservation, limit, and shares, and the resources available to its resource pool.

    Note that the entitlement Ei can be higher than the resource demand calculated for the VM in step #1, but it is still bound by the reservation minimum and limit maximum. The entitlement denotes the resources that a VM is configured to access, and does not indicate that the VM is actually currently using those resources.
  3. For each host, DRS calculates the resource entitlement ratio (Nh) as the sum of all the VM entitlements (Ei) divided by the capacity on the host.

    For example, given a host with 32 GB RAM running 10 VMs, each of which has a memory entitlement (Ei) of 4 GB, Nh would be (10*4)/32 = 1.25, indicating the host does not have the resources to meet the memory entitlements.

    If 3 VMs were moved off the host, the ratio would be (7*4)/32 = 0.875, which would allow the host to meet entitlements even if every VM demanded its full entitlement at the same time.
  4. The imbalance in the cluster (Ic) is calculated as the standard deviation over all the hosts for the entitlement ratio Nh. The imbalance value considers both memory and CPU equally, but if one of the resources is highly contended, the equation will be weighted toward that component with a 3:1 ratio.
  5. DRS runs through scenarios in which VMs are vMotioned to different hosts, looking for scenarios that minimize the Ic value.
  6. In addition to minimizing the cluster imbalance, DRS considers the following:
    • vMotion requires host resources, and for large VMs or busy hosts, the resources required for vMotion to correct a cluster imbalance may be more disruptive than the imbalance.
    • Previous recommendations or vMotions currently being executed. DRS will not overwrite a pending recommendation.
    • Affinity, Anti-Affinity, and Host Affinity rules.
  7. If DRS is set to Fully Automated, it will implement its recommendation. If not, it will make recommendations that the VMware administrator can approve.
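
The steps above can be condensed into a simplified Python sketch. This is a rough model of the methodology, not VMware’s implementation, and every name in it is hypothetical:

    from statistics import pstdev

    def cpu_demand(used, ready, run, sleep):
        """Step 1: CPU used plus the share of ready time proportional
        to how often the VM runs."""
        return used + ready * run / (run + sleep)

    def memory_demand(active, idle_consumed):
        """Step 1: active memory plus 25% of idle consumed memory."""
        return active + 0.25 * idle_consumed

    def entitlement(demand, reservation=0, limit=float("inf")):
        """Step 2 (simplified): bound demand by reservation and limit.
        The real calculation also weighs shares and pool resources."""
        return min(max(demand, reservation), limit)

    def host_ratio(entitlements, capacity):
        """Step 3: Nh = sum of VM entitlements / host capacity."""
        return sum(entitlements) / capacity

    def cluster_imbalance(host_ratios):
        """Step 4 (simplified): Ic = standard deviation of the Nh
        values. The real metric combines CPU and memory, weighting a
        highly contended resource more heavily."""
        return pstdev(host_ratios)

    # Step 3's example: a 32 GB host running ten VMs, each entitled to 4 GB.
    print(host_ratio([4] * 10, 32))  # 1.25 -> overcommitted
    print(host_ratio([4] * 7, 32))   # 0.875 after moving 3 VMs away

In step 5, DRS would evaluate candidate vMotions by recomputing the imbalance for each hypothetical placement and favoring moves that reduce it, subject to the considerations in step 6.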

Conclusion

As VMware configurations become more complex, DRS can simplify management by automating load balancing on clusters and enforcing VM constraints. However, DRS is built on top of resource pools, resource allocation settings, and affinity rules, and these need to be configured carefully to ensure that VMs have the resources they require.

Want to learn more?

Download our Virtualization or Cloud IaaS Whitepaper - both technologies can provide redundancies that will maximize your uptime and that will allow you to squeeze out the most performance. Which is better and how do you decide?

Download the whitepaper:  Virtualization or Cloud IaaS?