CPU and memory are both essential to performance monitoring, but their fundamentally different natures call for different kinds of metrics. As we discussed a few posts back, CPU is a scheduled resource, so it is monitored through wait times and queue lengths. Memory is a fixed quantity, so it is monitored by looking for resource shortages and for processes that have trouble allocating memory.
Before we look at the metrics, keep in mind that while more RAM is better, you can run into upper limits that are determined by the system hardware and Windows version. In general, 32-bit systems can use up to 4 GB RAM, while 64-bit systems can use significantly more. For example, 32-bit Windows 10 Professional is limited to 4 GB while the 64-bit version can use up to 2 TB. Server editions are currently only supported on 64-bit architectures and can address even more RAM, with Windows Server 2016 having a 24 TB upper limit.
Since RAM can range from a few GB to 24 TB, there is no single threshold that determines when free memory is low. As we go through the metrics, keep in mind that detecting performance problems has to be based on evidence that processes are not receiving memory when they need it, or that the system is hitting a limit and cannot grant more memory.
Windows Memory Metrics: Windows Resource Monitor
The Windows Resource Monitor provides an overview of current memory use on the system:
Windows Resource Monitor Memory Display
In the image above, physical memory is divided into the following categories:
|Hardware Reserved||Reserved for hardware (e.g. Video, Ethernet adapters) and unavailable for user processes.|
|In Use||Total RAM in use by process working sets.|
|Modified||Memory pages that have been modified and need to be written to disk.|
|Standby||Pages that have not been accessed recently and have been released from working set. They can be paged out to disk if space is needed but are cached in memory in case their process needs them again.|
|Free||RAM that is not being used.|
The summary lines displayed at the bottom of the Resource Monitor group memory in terms of availability for new processes:
|Memory Category||Comprised Of||Measures|
|Available (Perfmon: Memory\Available MBytes)||Standby + Free||Available RAM to start new processes|
|Cached||Modified + Standby||RAM currently used as cache that can be freed if needed|
|Total||In Use + Modified + Standby + Free||Total RAM available on system for user processes|
|Installed||Total + Hardware Reserved||RAM installed on system|
Note that, as with the Linux memory handler, both cached memory and free memory are available for new user processes.
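The arithmetic behind the summary lines can be sketched in a few lines of Python. The class and field names below are illustrative assumptions, not part of any Windows API, and the sample values are made up for a machine with 16 GB installed.

```python
from dataclasses import dataclass


@dataclass
class PhysicalMemory:
    """Resource Monitor memory categories, all in megabytes (hypothetical type)."""
    hardware_reserved: int
    in_use: int
    modified: int
    standby: int
    free: int

    @property
    def available(self) -> int:
        # Available = Standby + Free
        return self.standby + self.free

    @property
    def cached(self) -> int:
        # Cached = Modified + Standby
        return self.modified + self.standby

    @property
    def total(self) -> int:
        # Total = In Use + Modified + Standby + Free
        return self.in_use + self.modified + self.standby + self.free

    @property
    def installed(self) -> int:
        # Installed = Total + Hardware Reserved
        return self.total + self.hardware_reserved


# Made-up sample values for illustration only.
mem = PhysicalMemory(hardware_reserved=90, in_use=5200,
                     modified=150, standby=6400, free=4544)
print(mem.available, mem.cached, mem.total, mem.installed)
```

Note that Standby is counted in both Available and Cached, so the summary lines overlap and should not be added together.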
Windows Memory Metrics: Perfmon and Task Manager
- Commit Limit
Perfmon: Memory\Commit Limit
This is the total amount of memory that can be used on the system, and is the sum of RAM and pagefile space. If pagefiles are set to automatically extend, and there is disk space for them to do so, then the commit limit can increase.
- Commit Size
Task Manager: Details tab\Commit size column (per process)
Each process has a Commit Size that is the sum of the memory in use for that process (physical memory + pagefile) and additional memory that is reserved by the memory manager for the process. While reserved memory is not currently in use, the memory manager ensures that it remains available. The commit size for a process can change over the life of the process.
- Committed Bytes
Perfmon: Memory\Committed Bytes
This is the total of the commit sizes for all processes on the system. As with the process commit size, committed bytes includes memory that is not yet allocated to a process but is reserved for future use. When committed bytes reaches the commit limit, the system extends the pagefile if possible; if it cannot, further memory requests fail, which can prevent additional processes from starting. If you cannot extend the pagefile, make sure:
Committed Bytes < Commit Limit
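As a rough sketch of that check, the Python below computes how much of the commit limit is still uncommitted. The function names and the 10% minimum headroom are assumptions chosen for illustration; pick a margin that fits your own baseline. The inputs are expected to be the values of Memory\Committed Bytes and Memory\Commit Limit, however you collect them.

```python
def commit_headroom(committed_bytes: int, commit_limit: int) -> float:
    """Return the fraction of the commit limit that is still uncommitted."""
    if commit_limit <= 0:
        raise ValueError("commit_limit must be positive")
    # Clamp at zero in case the two samples were taken at slightly different times.
    return max(0.0, 1.0 - committed_bytes / commit_limit)


def commit_ok(committed_bytes: int, commit_limit: int,
              min_headroom: float = 0.10) -> bool:
    """True if at least min_headroom of the commit limit is free.

    The 10% default is an arbitrary illustrative value, not a Windows rule.
    """
    return commit_headroom(committed_bytes, commit_limit) >= min_headroom


GIB = 1024 ** 3
print(commit_headroom(12 * GIB, 16 * GIB))   # 12 GiB committed of a 16 GiB limit
print(commit_ok(15 * GIB, 16 * GIB))         # only 6.25% headroom left
```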
- Working Set
Perfmon: Process\Working Set
The working set is the set of memory pages that a process has in RAM. The Windows memory manager monitors the working set and moves pages to standby memory if they are not actively being used. Note that the working set can include memory for shared files (e.g. a system DLL) that are accessed by multiple processes. This shared memory is counted as part of the working set for each process.
- Private Working Set
Perfmon: Process\Working Set - Private
This is the working set for a process not including shared memory. If the process ends, this is the memory that will be freed.
- Page Faults
Perfmon: Memory\Page Faults/sec
This is the number of times per second that a process needed to read in (“page in”) memory pages for its working set. There are two types of page faults:
- Soft faults:
In a soft fault, the memory page the process needs is already in RAM. For example, it may be in the working set of another process. The memory manager adds the existing memory page to the working set without the need for disk IO, so soft faults have little effect on system performance.
- Hard faults:
A hard fault occurs when the memory manager needs to read in data from disk, either from the pagefile, or reading a file directly. Hard faults can be a symptom of a memory shortage: if there is not enough RAM for the processes running on a system, the memory manager may be paging out working set memory for one process in order to make room for another process’ working set.
The Page Faults/sec counter does not differentiate between soft faults and hard faults, and some level of hard faults is to be expected, so develop a baseline and use it as a threshold. If page faults spike, correlate the spike with disk IO (e.g. Logical Disk\Current Disk Queue Length, Logical Disk\Disk Reads/sec, Logical Disk\Disk Writes/sec).
Longitude Page Fault Rate Summary Report
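The correlation step described above can be sketched as a simple classifier. This is only an illustration under stated assumptions: the function name, the labels it returns, and the factor of 2 over baseline are all hypothetical, and the inputs are expected to be samples of Memory\Page Faults/sec and Logical Disk\Disk Reads/sec taken over the same interval.

```python
def classify_fault_spike(page_faults_per_sec: float,
                         disk_reads_per_sec: float,
                         fault_baseline: float,
                         read_baseline: float,
                         factor: float = 2.0) -> str:
    """Classify one sample of page faults against its baseline.

    factor is an arbitrary illustrative multiplier: a value counts as a
    spike when it is more than factor times its baseline.
    """
    if page_faults_per_sec <= factor * fault_baseline:
        return "normal"
    # Page faults are elevated. If disk reads rose as well, the faults are
    # probably being served from disk (hard faults).
    if disk_reads_per_sec > factor * read_baseline:
        return "likely hard faults"
    # Elevated faults without extra disk reads point to soft faults.
    return "likely soft faults"


print(classify_fault_spike(5000, 900, fault_baseline=1000, read_baseline=100))
```

A spike classified as likely hard faults is the case worth investigating for a memory shortage.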
- Paged Pool and Nonpaged Pool
Perfmon: Process\Pool Paged Bytes and Memory\Pool Paged Bytes
Perfmon: Process\Pool Nonpaged Bytes and Memory\Pool Nonpaged Bytes
As the names suggest, the paged pool is a pool of memory pages that can be paged out to disk, while the nonpaged pool consists of memory pages that cannot be paged out and must remain in physical memory. These counters should be monitored against a baseline and can be used as indicators of processes with memory leaks.
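A pool counter that keeps climbing from sample to sample is one pattern that can point to a leak. As a minimal sketch, the Python below fits a least-squares line to evenly spaced samples of a counter such as Process\Pool Nonpaged Bytes and returns the average growth per sample. The function name is hypothetical, and what slope counts as a leak is left to your own baseline.

```python
def growth_per_sample(samples: list[float]) -> float:
    """Least-squares slope of evenly spaced samples (units per sample)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2            # mean of the sample indexes 0..n-1
    mean_y = sum(samples) / n
    numerator = sum((i - mean_x) * (y - mean_y)
                    for i, y in enumerate(samples))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


# Made-up nonpaged pool readings (in KB) taken at a fixed interval.
steady = [4100, 4090, 4105, 4095, 4100]
climbing = [4100, 4350, 4600, 4850, 5100]
print(growth_per_sample(steady), growth_per_sample(climbing))
```

A slope near zero suggests normal fluctuation, while a persistent positive slope over a long window is worth a closer look.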
Except for virtualized servers, memory is a fixed resource, and it can be easy to run more processes than can be supported on a system. Monitor the following counters to ensure that your processes have the memory they require:
|Memory\Committed Bytes, Memory\Commit Limit||Committed Bytes > Commit Limit||Too much memory is in use on the system. Ensure that the page file can be extended or stop processes if it cannot.|
|Memory\Available MBytes||Less than baseline||Baseline the memory needed for known processes. If available memory is less than baseline value, check for unexpected processes.|
|Process\Working Set - Private||Deviates from baseline||Memory deviations from baseline can indicate a process is not functioning normally.|
|Memory\Page Faults/sec, Logical Disk\Current Disk Queue Length||Deviates from baseline||High page faults can indicate a memory shortage. Correlate with disk IO metrics to check for hard page faults vs soft page faults.|
|Process\Pool Nonpaged Bytes||Deviates from baseline||Higher than usual nonpaged pool bytes for a process can indicate a memory problem.|
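Most of these checks compare a counter with its baseline. One simple way to define "deviates from baseline" is sketched below, using the mean and standard deviation of historical samples. The function names and the default of three standard deviations are assumptions for illustration, not a recommendation from any Windows documentation.

```python
import statistics


def build_baseline(history: list[float]) -> tuple[float, float]:
    """Return (mean, sample standard deviation) of historical readings."""
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    return statistics.mean(history), statistics.stdev(history)


def deviates(value: float, mean: float, stdev: float, k: float = 3.0) -> bool:
    """True if value is more than k standard deviations from the mean.

    k = 3.0 is an arbitrary illustrative default. If stdev is 0, any
    value different from the mean counts as a deviation.
    """
    return abs(value - mean) > k * stdev


# Made-up private working set readings (in MB) for one process.
history = [100, 102, 98, 101, 99]
mean, stdev = build_baseline(history)
print(deviates(101, mean, stdev), deviates(120, mean, stdev))
```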