The number of IT metrics available on Windows platforms is daunting, to say the least. Which metrics do you go after? How do you go after them? If you are a do-it-yourselfer and want to set out on your own before considering commercial technology, there are a number of methods to collect, archive, and alert on IT metrics.
Before delving into the various methods, it is important to get a fundamental understanding of Windows Management Infrastructure (MI) and its predecessor, Windows Management Instrumentation (WMI). For our purposes today, you can view them in their most basic form as a construct that provides a common way to define and access performance metrics for systems and applications. MI and WMI are based on the Common Information Model (CIM) maintained by the Distributed Management Task Force (DMTF). It is a bit of an oversimplification, but you can look at WMI as a database of sorts.
WMI can be used to query information about servers locally or remotely. Technologies that monitor Windows platforms “agentlessly” are most likely leveraging WMI. It is quite efficient and secure in its own right.
The first phase of monitoring IT operations is the collection phase. The goal is simply to gather the data so that we can alert on, or later report on, IT metrics related to the servers, network, and applications.
IT operations metrics related to performance and health can be easily collected and evaluated with PowerShell. PowerShell is really a godsend, as life before PowerShell (pre-Windows 2008) was Visual Basic and batch files! The PowerShell command-line interface is quite powerful and provides access to most Windows metrics for both local and remote servers. Two PowerShell cmdlets do most of the work: Get-WmiObject (the "original" cmdlet) and Get-Counter.
To access WMI objects, use the Get-WmiObject cmdlet with the WMI class to be queried as a parameter:
Get-WmiObject Win32_PerfFormattedData_PerfOS_Processor -ComputerName Longitude-Demo | Select-Object Name,PercentProcessorTime
The PowerShell command above collects CPU processor time from the Win32_PerfFormattedData_PerfOS_Processor WMI class on a remote server called Longitude-Demo. We can see in the screenshot below that usage is 26% across 2 CPUs.
To read a performance counter by referencing its path directly, use the Get-Counter cmdlet with the syntax Get-Counter -Counter '\<Counter Set>(<Instance>)\<Counter Name>':
Get-Counter -counter '\Processor(_Total)\% Processor Time' -computer Longitude-Demo
The PowerShell command above collects processor utilization from a remote server called Longitude-Demo. We can see in the screenshot below that a value of 39% is returned.
If you aren’t sure which counters are available to you, you’ll find the -ListSet parameter invaluable:
Get-Counter -ListSet *
The above command lists all the counter sets available on the local server, or on a remote server when you add the -ComputerName parameter (which can be abbreviated to -computer). The list can be pretty expansive, so you'll want to send the output to a file for closer review.
To narrow down the list, specify the CounterSetName that you want to target after reviewing the output of -ListSet *:
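The counter path syntax used by Get-Counter can be easier to see when it is assembled piece by piece. The following is a minimal Python sketch, not part of any Microsoft tooling; the function name `counter_path` and its parameters are hypothetical and exist only to illustrate how the counter set, instance, counter name, and optional computer name fit together.

```python
def counter_path(counter_set, counter, instance=None, computer=None):
    """Build a performance counter path such as
    \\Processor(_Total)\\% Processor Time (hypothetical helper)."""
    path = "\\" + counter_set
    if instance is not None:
        # Instances such as _Total or a drive letter go in parentheses.
        path += "(" + instance + ")"
    path += "\\" + counter
    if computer:
        # A remote path is prefixed with two backslashes and the computer name.
        path = "\\\\" + computer + path
    return path


local_path = counter_path("Processor", "% Processor Time", instance="_Total")
remote_path = counter_path(
    "Processor", "% Processor Time", instance="_Total", computer="Longitude-Demo"
)
print(local_path)
print(remote_path)
```

The first path is the one passed to -Counter in the command above; the second is the fully qualified form that names the remote server inside the path itself.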
Get-Counter -ListSet Processor -computer Longitude-Demo
The command above returns detailed information about the Processor counter set.
Get-Counter -ListSet Processor -ComputerName Longitude-Demo | Select-Object -ExpandProperty Counter
Expanding the Counter property returns summary information: just the list of counter paths in the set.
Here we targeted the remote server Longitude-Demo, identified the Processor set of IT metrics as the one we wanted to go after, and specifically picked out the "% Processor Time" counter.
A sizable library of PowerShell scripts can be found on Microsoft TechNet’s Script Center, including a sample PowerShell script that uses Get-Counter to monitor CPU utilization and notify if a threshold is breached.
When deciding which cmdlet to use, note that both Get-Counter and Get-WmiObject provide access to the most common performance metrics. There may be instances where the generic nature of Get-Counter won’t give you access to exactly what you require, and you’ll need to fall back to Get-WmiObject.
If you are looking to export performance data, then Get-Counter along with the Export-Counter cmdlet is definitely worth a closer look. Export-Counter writes to log formats including the Windows Performance Monitor’s binary performance log (.blg), comma-separated values (.csv), and tab-separated values (.tsv).
Table of Helpful WMI Classes and returned IT Metrics
The WMI classes below deliver the IT Metrics that serve as a foundation for monitoring the Windows infrastructure.
| WMI Class | Category | Useful IT Metrics |
|---|---|---|
| Win32_PerfRawData_PerfOS_System | CPU Metrics | Thread Count, Process Count, CPU Queue Length |
| Win32_PerfRawData_PerfOS_Processor | CPU Metrics | % Interrupt Time, % Privileged Time, % Processor Time, % User Time |
| Win32_PerfFormattedData_PerfDisk_LogicalDisk | Disk | Disk Space Used %, Disk Queue Length, Disk Reads/sec, Disk Writes/sec, Disk Read Bytes/sec, Disk Write Bytes/sec, Disk Transfers/sec |
| Win32_DiskDrive | Disk | Disk Space Capacity |
| Win32_PerfRawData_Tcpip_NetworkInterface | Network | Output Queue Length, Packet Inbound/Outbound Errors, Kilobytes Sent/sec, Kilobytes Received/sec |
| Win32_Printer | Printer | Printer availability and status |
| Win32_PerfRawData_PerfProc_Process | Process | Process name, ID, Elapsed Time, % Privileged Time, % Processor Time, Page File Bytes, Thread Count |
| Win32_Service | Service | Service Exit Code, Started, Start Mode, Start Name, State |
| Win32_NTLogEvent | Event Log | Windows Event Log events |
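Several of the classes above are Win32_PerfRawData_* classes, which return cumulative raw values rather than ready-to-read percentages, so two samples are needed to derive a rate. The Python sketch below assumes that PercentProcessorTime in Win32_PerfRawData_PerfOS_Processor is an inverse 100-nanosecond timer (PERF_100NSEC_TIMER_INV in Microsoft's counter-type documentation) whose raw value counts idle time and whose time base is Timestamp_Sys100NS; verify the counter type on your own systems before relying on it. The function name and sample numbers are hypothetical.

```python
def percent_busy(idle0, ts0, idle1, ts1):
    """Derive % Processor Time from two raw samples.

    idle0/idle1: raw PercentProcessorTime values (assumed to be idle time
                 in 100 ns units) from the first and second sample.
    ts0/ts1:     matching Timestamp_Sys100NS values.
    Formula assumed: (1 - (N1 - N0) / (D1 - D0)) * 100
    """
    elapsed = ts1 - ts0
    if elapsed <= 0:
        raise ValueError("second sample must be taken after the first")
    pct = (1.0 - (idle1 - idle0) / elapsed) * 100.0
    # Clamp small overshoots caused by timer jitter.
    return max(0.0, min(100.0, pct))


# Made-up samples one second apart (10,000,000 x 100 ns),
# with the CPU idle for 0.75 s of that interval.
print(percent_busy(0, 0, 7_500_000, 10_000_000))
```

The Win32_PerfFormattedData_* classes perform this kind of calculation for you, which is why they are usually the easier starting point.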
typeperf.exe has been around for some time and is a powerful command-line utility that writes performance data to log files, prints it to the command window, and can be scripted to write to SQL Server. If you like to script and need performance data, typeperf is a great option.
typeperf "\Processor(_Total)\% Processor Time" -s Longitude-Demo
Here we targeted the remote server Longitude-Demo for the "% Processor Time" counter.
The “go to” program for many Windows performance problems is Perfmon. The Perfmon interface lets you see what is happening in real time, comes with excellent graphics, and can report on historical information. (Remember that counter data captured with Get-Counter can also be fed to Perfmon, for example by exporting it to a .blg log with Export-Counter.)
Here we see Perfmon graphing processor time for a server “Longitude-Demo”.
Perfmon can also be configured to alert via a command (e.g., MSG) or with a script.
For example, an alert can be defined on processor usage.
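Because typeperf prints comma-separated values, its output is easy to post-process in a script. Below is a minimal Python sketch that parses captured typeperf console output; the `SAMPLE` text is illustrative only (it was not captured from a real server), and the function name `parse_typeperf` is hypothetical.

```python
import csv
import io

# Illustrative output only; real values and timestamps will differ.
SAMPLE = r'''
"(PDH-CSV 4.0)","\\Longitude-Demo\Processor(_Total)\% Processor Time"
"01/01/2024 10:00:01.000","12.500000"
"01/01/2024 10:00:02.000","37.500000"
Exiting, please wait...
The command completed successfully.
'''


def parse_typeperf(text):
    """Return (counter_paths, rows) parsed from typeperf CSV output."""
    # CSV lines start with a double quote; blank lines and status
    # messages such as "Exiting, please wait..." are skipped.
    csv_lines = [ln for ln in text.splitlines() if ln.startswith('"')]
    reader = csv.reader(io.StringIO("\n".join(csv_lines)))
    header = next(reader)
    counters = header[1:]  # the first column is the timestamp
    rows = []
    for record in reader:
        try:
            values = [float(v) for v in record[1:]]
        except ValueError:
            continue  # skip samples with missing or non-numeric values
        rows.append((record[0], values))
    return counters, rows


counters, rows = parse_typeperf(SAMPLE)
average = sum(values[0] for _, values in rows) / len(rows)
print(counters[0], "average:", average)
```

The same parser should also work on a .csv file written by typeperf or Export-Counter, on the assumption that those files use the same header-plus-rows layout.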
WMI is also readily accessible via VBScript and other languages such as Perl. There is a bit more “plumbing” to contend with, as the PowerShell cmdlets abstract away some of the complexity. If you’re going to be collecting a large number of metrics or are concerned about overhead, then coding may be the way to go. For example, VBScript uses slightly fewer resources than PowerShell, but you’ll need to allocate quite a bit more time for the coding. If you’re going to collect a large volume of metrics, you’ll want to make sure your VBScript (or PowerShell, for that matter) runs multi-threaded, as this is the only way you’ll be able to collect and evaluate the data on a timely basis.
When it comes to collecting metrics related to IT operations, the challenge isn’t so much the mechanics of collecting the data itself, but what you do with the data once you have it.
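The multi-threading point can be sketched as follows. This is a minimal Python illustration of the idea rather than a recommended implementation: `collect_all` and `fake_collect` are hypothetical names, and `fake_collect` is a stand-in for whatever real query (WMI, typeperf, and so on) you would run against each server.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def collect_all(servers, collect, max_workers=8):
    """Run collect(server) for every server in parallel threads.

    Returns (results, errors) so one slow or unreachable server
    does not stop the others from being collected.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(collect, s): s for s in servers}
        for future in as_completed(futures):
            server = futures[future]
            try:
                results[server] = future.result()
            except Exception as exc:
                errors[server] = str(exc)
    return results, errors


def fake_collect(server):
    """Hypothetical collector: replace with a real remote query."""
    if server == "Offline-Demo":
        raise TimeoutError("no response")
    return {"cpu_percent": float(len(server))}  # dummy value


results, errors = collect_all(["Longitude-Demo", "Offline-Demo"], fake_collect)
print(results)
print(errors)
```

Threads suit this job because most of the time is spent waiting on remote servers; the worker count is an assumption to tune against the number of servers and the load you can tolerate on the collector.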
- Is the data going to be archived for reporting?
- What kind of alerting, thresholds, and escalation are going to be embedded in the script?
If you’re looking to archive the data for reporting, you’ll need to give consideration to the format (e.g., .csv files or a SQL database), the volume, and data retention. This will impact not only the method you use to collect the data, but also how often you collect it.
Alerting is straightforward when there are a limited number of metrics and thresholds. When scripting the alerts, special consideration needs to be given to the alerting criteria. As the number of servers and applications being monitored expands, “one size fits all” thresholds can present challenges. Too many false alerts and you’ll get into a bit of a “boy who cried wolf” situation, where the volume of alerts is so high that it is difficult to separate serious problems from the noise. Some upfront planning, covering not only which metrics to target but also the most appropriate alerting thresholds, can go a long way towards a successful IT monitoring effort.
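As a rough illustration of the format and retention decisions, here is a minimal Python sketch that archives samples in SQLite and purges rows older than a retention window. SQLite, the table layout, and the function names are all assumptions chosen to keep the example self-contained; the same idea applies to SQL Server or any other database.

```python
import sqlite3

SECONDS_PER_DAY = 86400


def open_archive(path=":memory:"):
    """Open (or create) the archive; ':memory:' keeps the demo self-contained."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        "ts INTEGER NOT NULL, server TEXT NOT NULL, "
        "metric TEXT NOT NULL, value REAL NOT NULL)"
    )
    db.execute("CREATE INDEX IF NOT EXISTS idx_samples_ts ON samples (ts)")
    return db


def store(db, ts, server, metric, value):
    """Insert one sample; ts is a Unix timestamp in seconds."""
    db.execute("INSERT INTO samples VALUES (?, ?, ?, ?)", (ts, server, metric, value))
    db.commit()


def purge_older_than(db, now, retention_days):
    """Delete samples older than the retention window; return rows removed."""
    cutoff = now - retention_days * SECONDS_PER_DAY
    cursor = db.execute("DELETE FROM samples WHERE ts < ?", (cutoff,))
    db.commit()
    return cursor.rowcount


db = open_archive()
now = 100 * SECONDS_PER_DAY
store(db, 0, "Longitude-Demo", "cpu_percent", 26.0)    # 100 days old
store(db, now, "Longitude-Demo", "cpu_percent", 39.0)  # current
removed = purge_older_than(db, now, retention_days=30)
remaining = db.execute("SELECT COUNT(*) FROM samples").fetchone()[0]
print(removed, remaining)
```

Collection frequency multiplies directly into row count here, which is why the collection interval and the retention period are best decided together.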
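One common way to cut down on false alerts is to combine per-server thresholds with a requirement that the threshold be breached several samples in a row. The Python sketch below shows that idea under assumed names and numbers; the class `ThresholdAlerter`, the server "Batch-Demo", and the 80/95 thresholds are hypothetical, not values taken from any product.

```python
from collections import defaultdict, deque


class ThresholdAlerter:
    """Alert only after `consecutive` samples in a row exceed the threshold."""

    def __init__(self, default_threshold, overrides=None, consecutive=3):
        self.default_threshold = default_threshold
        self.overrides = overrides or {}  # per-server thresholds
        self.consecutive = consecutive
        # Each server keeps only its most recent breach flags.
        self._recent = defaultdict(lambda: deque(maxlen=consecutive))

    def threshold_for(self, server):
        return self.overrides.get(server, self.default_threshold)

    def observe(self, server, value):
        """Record a sample; return True when an alert should fire."""
        history = self._recent[server]
        history.append(value > self.threshold_for(server))
        return len(history) == self.consecutive and all(history)


alerter = ThresholdAlerter(80, overrides={"Batch-Demo": 95}, consecutive=3)
cpu_samples = [90, 85, 70, 91, 92, 93]
fired = [alerter.observe("Longitude-Demo", v) for v in cpu_samples]
print(fired)
```

A single spike to 90% never fires here, and a server that is expected to run hot (the hypothetical Batch-Demo) gets its own higher threshold instead of inheriting the default.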