August 22, 2013 | Susan Bilder

One of the basic principles in performance tuning is that server performance – physical or virtual – is governed by the slowest component on the server.  And, in general, if CPU, memory and network components have been adequately provisioned, the slowest component will be disk I/O.

In VMware, storage latency for guest VMs is a combination of kernel latency (the time the I/O request needs to make it through the hypervisor) and the latency from the storage device and its associated hardware.  The relationship is:

    GAVG = KAVG + DAVG

Here, GAVG is the guest average latency, KAVG is the kernel average latency, and DAVG is the device average latency.  The Yellow Bricks web site discusses using the esxtop command in the ESX shell to observe these values, and suggests a DAVG threshold of 25 ms.  However, VMware recommends keeping DAVG consistently between 10 and 15 ms in order to meet most SLA requirements.
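
To make the arithmetic concrete, here is a minimal Python sketch that adds kernel and device latency to get guest latency (GAVG = KAVG + DAVG) and compares DAVG against the two thresholds quoted above. The class, function, and constant names are hypothetical, and the sample values are invented for illustration; this is not part of esxtop or any VMware tool.

```python
from dataclasses import dataclass

# Thresholds quoted in the article, in milliseconds.
DAVG_YELLOW_BRICKS_MS = 25.0  # Yellow Bricks suggested threshold
DAVG_VMWARE_UPPER_MS = 15.0   # upper end of VMware's 10-15 ms range


@dataclass
class LatencySample:
    """One hypothetical latency reading, in milliseconds."""
    kavg_ms: float  # kernel average latency
    davg_ms: float  # device average latency

    @property
    def gavg_ms(self) -> float:
        # Guest latency is the sum of kernel and device latency.
        return self.kavg_ms + self.davg_ms


def davg_status(davg_ms: float) -> str:
    """Classify a DAVG value against the two quoted thresholds."""
    if davg_ms > DAVG_YELLOW_BRICKS_MS:
        return "above Yellow Bricks threshold"
    if davg_ms > DAVG_VMWARE_UPPER_MS:
        return "above VMware range"
    return "at or below VMware upper bound"


if __name__ == "__main__":
    sample = LatencySample(kavg_ms=1.5, davg_ms=12.0)  # invented values
    print(f"GAVG = {sample.gavg_ms} ms, DAVG is {davg_status(sample.davg_ms)}")
```

A single reading says little on its own. The word "consistently" in VMware's recommendation suggests applying a check like this across many samples over time.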

The generally recommended KAVG threshold is 2 ms or less.  If the value is significantly higher than this, the problem could be due to queuing in the kernel (QAVG), which should be 0 according to VMware’s vSphere 5 documentation.
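
The kernel-side check can be sketched the same way. The hypothetical Python function below takes a KAVG and QAVG reading, both in milliseconds, and reports which of the two guidelines above are exceeded. The function name and messages are assumptions for illustration only.

```python
from typing import List

KAVG_THRESHOLD_MS = 2.0  # "2 ms or less" guideline quoted in the article


def kernel_latency_findings(kavg_ms: float, qavg_ms: float) -> List[str]:
    """Return a list of guideline violations for one kernel latency reading."""
    findings = []
    if kavg_ms > KAVG_THRESHOLD_MS:
        findings.append("KAVG above 2 ms")
    if qavg_ms > 0:
        # QAVG is expected to be 0, so any non-zero value hints at queuing.
        findings.append("QAVG non-zero: possible kernel queuing")
    return findings


if __name__ == "__main__":
    print(kernel_latency_findings(kavg_ms=4.2, qavg_ms=3.1))  # invented values
```

An empty list means neither guideline was exceeded for that reading.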

The VMware guide also includes recommendations on addressing disk I/O performance problems, but the suggestions are more useful for KAVG and QAVG issues, since those are specific to the VMware hypervisor.  Device latency issues are best addressed by referring to vendor documentation and tools.

In practice, the exact GAVG threshold for an application on a guest VM to function satisfactorily will depend on the sensitivity of the application to I/O latency, and on the contention with other VMs for disk I/O.  Creating baselines for performance across multiple VMs and correlating those baselines with VMware latency values can help you identify I/O contention problems, and plan out a more efficient distribution for VMs over storage devices and hosts.
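
One simple way to correlate a baseline with latency values is a Pearson correlation between an application metric and GAVG sampled over the same intervals. The sketch below is one possible approach, not a prescribed method. The function name is hypothetical and the two series are invented; in practice they would come from your own monitoring data.

```python
from math import sqrt
from statistics import mean
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented samples taken over the same intervals.
    app_response_ms = [110.0, 125.0, 180.0, 240.0, 130.0]
    gavg_ms = [8.0, 10.0, 19.0, 27.0, 11.0]
    print(f"baseline GAVG: {mean(gavg_ms):.1f} ms")
    print(f"correlation with app response time: {pearson(app_response_ms, gavg_ms):.2f}")
```

A coefficient near 1 would suggest that the application slows down as guest latency rises, which makes that VM a candidate for moving to less contended storage. A coefficient near 0 points the investigation elsewhere.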

