
5 Methods for Collecting Windows IT Metrics for Operations

April 04, 2017 | Ken Leoni

 

The number of IT metrics available on Windows platforms is daunting, to say the least. Which metrics do you go after? How do you go after them? If you are a do-it-yourselfer and want to set out on your own before considering commercial technology, there are a number of methods to collect, archive, and alert on IT metrics.


Before delving into the various methods, it is important to get a fundamental understanding of Windows Management Infrastructure (MI) and its predecessor, Windows Management Instrumentation (WMI). For our purposes today, you can view them in their most basic form as a construct that provides a common methodology to define and access performance metrics related to systems and applications. MI and WMI are based on the Common Information Model (CIM) maintained by the Distributed Management Task Force (DMTF). It is a bit of an oversimplification, but you can look at WMI as a database of sorts.

WMI can be used to query information about servers, either locally or remotely. Technologies that monitor Windows platforms “agentlessly” are most likely leveraging WMI. It is quite efficient and secure in its own right.

The first phase of monitoring IT operations is the collection phase. Basically, what we want to do is simply gather the data so that we can alert or later report on IT metrics related to the servers, network, and applications.  

IT operations metrics related to performance and health can be easily collected and evaluated with PowerShell. PowerShell is really a godsend, as life before PowerShell (pre-Windows Server 2008) was Visual Basic and .bat files! The PowerShell command-line interface is quite powerful and provides access to almost all Windows metrics for both local and remote servers. There are two PowerShell commandlets (cmdlets) to know: Get-WMIObject (the "original" cmdlet) and Get-Counter.

PowerShell Get-WMIObject

To access WMI objects, you can use the Get-WMIObject cmdlet with the WMI object to be queried as a parameter.

For example:

Get-WMIObject Win32_PerfFormattedData_PerfOS_Processor -ComputerName Longitude-Demo | Select-Object Name,PercentProcessorTime

The PowerShell command above collects CPU processor time from the Win32_PerfFormattedData_PerfOS_Processor WMI class on a remote server called Longitude-Demo. The screenshot below shows usage of 26% across 2 CPUs.

Collecting Windows IT Metrics with PowerShell Get-WMIObject
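The same class can also be queried through Get-CimInstance, the MI-based counterpart to Get-WMIObject (Get-WMIObject is not available in PowerShell 6 and later). The following is a minimal sketch rather than a definitive implementation: the function name Get-CpuUsage is hypothetical, and it assumes the remote server accepts WS-Management connections, which Get-CimInstance uses by default.

```powershell
# Hypothetical helper: returns per-CPU and _Total processor time for one server.
function Get-CpuUsage {
    param(
        [Parameter(Mandatory)]
        [string]$ComputerName
    )

    # Same formatted performance class as the Get-WMIObject example above.
    Get-CimInstance -ClassName Win32_PerfFormattedData_PerfOS_Processor -ComputerName $ComputerName |
        Select-Object -Property Name, PercentProcessorTime
}

# Example call (server name taken from this article):
# Get-CpuUsage -ComputerName 'Longitude-Demo'
```

Wrapping the query in a function keeps the class name in one place if you later query several servers.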

PowerShell Get-Counter

To access the same performance data by referencing the counter path directly, use the Get-Counter cmdlet with the syntax Get-Counter -Counter '\<Counter Set>(<Instance>)\<Counter Name>'

For example:

Get-Counter -counter '\Processor(_Total)\% Processor Time' -computer Longitude-Demo

The PowerShell command above collects processor utilization from a remote server called Longitude-Demo. The screenshot below shows that a value of 39% is returned.

Collecting Windows IT Metrics with PowerShell Get-Counter
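A single reading can be noisy, so it is often useful to take several samples. Get-Counter accepts -SampleInterval and -MaxSamples for this, and each returned sample set exposes its readings through CounterSamples and CookedValue. The sketch below averages a handful of samples; the function name Get-CounterAverage and the default interval and sample count are assumptions for illustration.

```powershell
# Hypothetical helper: average a counter over several samples.
function Get-CounterAverage {
    param(
        [Parameter(Mandatory)]
        [string]$ComputerName,
        [string]$Counter = '\Processor(_Total)\% Processor Time',
        [int]$SampleInterval = 2,   # seconds between samples (assumed default)
        [int]$MaxSamples = 5        # number of samples to take (assumed default)
    )

    $params = @{
        Counter        = $Counter
        ComputerName   = $ComputerName
        SampleInterval = $SampleInterval
        MaxSamples     = $MaxSamples
    }
    $sampleSets = Get-Counter @params

    # Each sample set holds one CounterSample per counter instance;
    # CookedValue is the calculated value of that sample.
    $values = $sampleSets |
        ForEach-Object { $_.CounterSamples } |
        ForEach-Object { $_.CookedValue }

    ($values | Measure-Object -Average).Average
}

# Example call:
# Get-CounterAverage -ComputerName 'Longitude-Demo'
```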

If you aren’t sure what counters are available to you, then you’ll find the -ListSet parameter invaluable:

Get-Counter -ListSet *

The above command lists all the counter sets available to you on the local server, or on a remote server when combined with the -ComputerName parameter (abbreviated as -computer in these examples). The list can be pretty extensive, so you'll want to send the output to a file for closer review.

To narrow down the list of counters, simply specify the CounterSetName that you want to target after reviewing the output of -ListSet *

For example:

Get-Counter -ListSet  Processor -computer Longitude-Demo

provides detailed information about the Processor counter set, while

Get-Counter -ListSet  Processor -computer Longitude-Demo | Select-Object -ExpandProperty Counter

provides summary information: just the list of counters in the set.

Here we targeted the remote server Longitude-Demo, identified the Processor set as the IT metrics we wanted to go after, and singled out the "% Processor Time" counter.

Collecting Windows IT Metrics with PowerShell Get-Counter and -ListSet

A sizable library of PowerShell scripts can be found on Microsoft TechNet’s Script Center. Here is a sample PowerShell script that uses Get-Counter to monitor CPU utilization and notify you if a threshold is breached.
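To give a feel for what such a check involves, here is a minimal sketch. It is not the Script Center script; the function names, the 85% default, and the use of Write-Warning as the notification are all assumptions for illustration.

```powershell
# Pure decision logic, kept separate so it can be tested without a live server.
function Test-ThresholdBreached {
    param(
        [Parameter(Mandatory)][double]$Value,
        [Parameter(Mandatory)][double]$Threshold
    )
    return ($Value -ge $Threshold)
}

# Hypothetical wrapper: samples CPU once and writes a warning on a breach.
function Watch-CpuOnce {
    param(
        [Parameter(Mandatory)][string]$ComputerName,
        [double]$Threshold = 85   # assumed default threshold
    )

    $sample = Get-Counter -Counter '\Processor(_Total)\% Processor Time' -ComputerName $ComputerName
    $cpu = $sample.CounterSamples[0].CookedValue

    if (Test-ThresholdBreached -Value $cpu -Threshold $Threshold) {
        # Swap Write-Warning for your own notification (email, MSG, ticket).
        Write-Warning ('{0}: CPU at {1:N1}% (threshold {2}%)' -f $ComputerName, $cpu, $Threshold)
    }
}

# Example call:
# Watch-CpuOnce -ComputerName 'Longitude-Demo' -Threshold 85
```

Keeping the comparison in its own function makes it easy to change the alerting rule later without touching the collection code.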

When considering which cmdlet to use, both Get-Counter and Get-WMIObject provide access to the most common WMI constructs. There may be instances where the generic nature of the Get-Counter cmdlet won’t give you access to exactly what you require, and you’ll need to fall back on Get-WMIObject.

If you are looking to export performance data, then Get-Counter along with the Export-Counter cmdlet is definitely worth a closer look. The Export-Counter cmdlet exports to several log formats: the Windows Performance Monitor’s binary performance log (.blg), comma-separated value (.csv), or tab-separated value (.tsv).
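As a rough sketch of how the two cmdlets fit together, the function below pipes a minute of CPU samples into Export-Counter. It assumes you are running Windows PowerShell (5.1 or earlier), where Export-Counter ships; the function name Export-CpuLog, the sampling values, and the example path are hypothetical.

```powershell
# Hypothetical helper: sample CPU on a server and save the samples to a log file.
function Export-CpuLog {
    param(
        [Parameter(Mandatory)][string]$ComputerName,
        [Parameter(Mandatory)][string]$Path,       # for example C:\PerfLogs\cpu.blg (illustrative path)
        [ValidateSet('blg', 'csv', 'tsv')]
        [string]$FileFormat = 'blg'
    )

    # 12 samples, 5 seconds apart (assumed values), then written in the chosen format.
    # -Force overwrites an existing file at $Path.
    Get-Counter -Counter '\Processor(_Total)\% Processor Time' -ComputerName $ComputerName -SampleInterval 5 -MaxSamples 12 |
        Export-Counter -Path $Path -FileFormat $FileFormat -Force
}

# Example call:
# Export-CpuLog -ComputerName 'Longitude-Demo' -Path 'C:\PerfLogs\cpu.csv' -FileFormat csv
```

The .blg format is the one to choose if you plan to open the log in Perfmon; .csv is easier to load into a spreadsheet or database.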

Table of Helpful WMI Classes and returned IT Metrics

The WMI classes below deliver the IT Metrics that serve as a foundation for monitoring the Windows infrastructure.

| WMI Class | Category | Useful IT Metrics |
| --- | --- | --- |
| Win32_PerfRawData_PerfOS_System | CPU Metrics | Thread Count, Process Count, CPU Queue Length |
| Win32_PerfRawData_PerfOS_Processor | CPU Metrics | % Interrupt Time, % Privileged Time, % Processor Time, % User Time |
| Win32_PerfFormattedData_PerfDisk_LogicalDisk | Disk | Disk Space Used %, Disk Queue Length, Disk Reads/sec, Disk Writes/sec, Disk Read Bytes/sec, Disk Write Bytes/sec, Disk Transfers/sec |
| Win32_DiskDrive | Disk | Disk Space Capacity |
| Win32_ComputerSystem | Memory | Memory Capacity |
| Win32_PerfRawData_Tcpip_NetworkInterface | Network | Output Queue Length, Packet Inbound/Outbound Errors, Kilobytes Sent/sec, Kilobytes Received/sec |
| Win32_Printer | Printer | Printer availability and status |
| Win32_PerfRawData_PerfProc_Process | Process | Process name, ID, Elapsed Time, % Privileged Time, % Processor Time, Page File Bytes, Thread Count |
| Win32_Service | Service | Service Exit Code, Started, Start Mode, Start Name, State |
| Win32_NTLogEvent | Event Log | Windows Eventlog Events |
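As one example of putting these classes to work, the sketch below uses Win32_Service to find services that are set to start automatically but are not running. The function name Get-StoppedAutoService is hypothetical, and the WQL filter is just one reasonable way to express the check.

```powershell
# Hypothetical helper: list auto-start services that are not currently running.
function Get-StoppedAutoService {
    param(
        [Parameter(Mandatory)]
        [string]$ComputerName
    )

    # WQL filter evaluated on the target server, so only matching rows come back.
    $filter = "StartMode = 'Auto' AND State <> 'Running'"

    Get-WmiObject -Class Win32_Service -Filter $filter -ComputerName $ComputerName |
        Select-Object -Property Name, DisplayName, State, StartMode, ExitCode
}

# Example call:
# Get-StoppedAutoService -ComputerName 'Longitude-Demo'
```

Filtering with -Filter rather than Where-Object keeps the amount of data sent over the network down, which matters once you query many servers.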

 

Typeperf

typeperf.exe has been around for some time and is a powerful command-line utility that writes performance data to log files or to the command window, and can be scripted to write to SQL Server. If you like to script and need performance data, typeperf is a great option.

For example:

typeperf "\Processor(_Total)\% Processor Time" -s Longitude-Demo

Here we targeted the remote server Longitude-Demo for the "% Processor Time" counter.

Collecting Windows IT Metrics with Typeperf
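typeperf becomes more useful for archiving once you control the sampling and send the output to a file. The sketch below wraps the same counter in a PowerShell function; the function name Start-TypeperfLog and the default interval and sample count are assumptions for illustration.

```powershell
# Hypothetical helper: log a counter to a CSV file with typeperf.
function Start-TypeperfLog {
    param(
        [Parameter(Mandatory)][string]$ComputerName,
        [Parameter(Mandatory)][string]$OutputPath,
        [int]$IntervalSeconds = 5,   # assumed default
        [int]$SampleCount = 12       # assumed default
    )

    # -s  = server to query          -si = seconds between samples
    # -sc = number of samples        -f  = output format
    # -o  = output file              -y  = answer yes to prompts (such as overwrite)
    typeperf '\Processor(_Total)\% Processor Time' -s $ComputerName -si $IntervalSeconds -sc $SampleCount -f CSV -o $OutputPath -y
}

# Example call:
# Start-TypeperfLog -ComputerName 'Longitude-Demo' -OutputPath 'C:\PerfLogs\cpu.csv'
```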

Perfmon

The “go to” program for many Windows performance problems is Perfmon. The Perfmon interface allows you to see what is happening in real time, comes with excellent graphics, and provides the ability to report on historical information. (Remember that you can also use the Get-Counter cmdlet to feed data to Perfmon.)

For example:

Here we see Perfmon graphing processor time for a server “Longitude-Demo”

 Collecting IT Metrics with Perfmon

Perfmon can also be configured to alert via a command (e.g., MSG) or with a script.

Collecting IT Metrics and alerting using Perfmon

 

For example, to alert on processor usage:

  1. Launch  "perfmon"
  2. Open Data Collector Sets
  3. Open User Defined
  4. Create New Data Collector Set
  5. When wizard starts, select "Create manually...", Next
  6. Select Performance Alert, Next
  7. Click Add, Open Processor
  8. Select % Processor Time
  9. Click Add, OK
  10. Enter 85 in Limit, Next
  11. Finish
  12. Double-click your new data collector set to open its properties
  13. Click Alert Task tab
  14. Enter your command or script that generates an alert notification
  15. Click OK
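If you would rather script the steps above than click through the wizard, the logman command-line tool can create a performance alert as well. Treat the following as a sketch under assumptions: the collector set name HighCpuAlert and both function names are hypothetical, and you should confirm the options against logman create alert /? on your system before relying on them.

```powershell
# Pure helper: builds the threshold expression that logman expects.
function Format-AlertThreshold {
    param(
        [string]$Counter = '\Processor(_Total)\% Processor Time',
        [int]$Limit = 85
    )
    return ('{0}>{1}' -f $Counter, $Limit)
}

# Hypothetical wrapper: creates and starts a CPU alert data collector set.
function New-CpuAlert {
    param(
        [string]$Name = 'HighCpuAlert',   # hypothetical collector set name
        [int]$Limit = 85
    )

    $threshold = Format-AlertThreshold -Limit $Limit

    # -th = alert threshold, -si = sample interval (hh:mm:ss),
    # -el = write an entry to the event log when the alert fires.
    logman create alert $Name -th $threshold -si 00:00:15 -el
    logman start $Name
}

# Example call (run from an elevated prompt):
# New-CpuAlert -Name 'HighCpuAlert' -Limit 85
```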

 

Script

WMI is also readily accessible via VBScript and other languages such as Perl. There is a bit more “plumbing” to contend with, as the PowerShell cmdlets abstract away some of the complexity. If you’re going to be collecting a large number of metrics or are concerned about overhead, then coding may be the way to go. For example, VBScript uses slightly fewer resources than PowerShell, but you’ll need to allocate quite a bit more time for the coding than with PowerShell. If you’re going to collect a large volume of metrics, you’ll want to make sure your VBScript (or PowerShell, for that matter) runs multi-threaded, as this is the only way you’ll be able to collect and evaluate the data on a timely basis.
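One simple way to get this kind of concurrency in PowerShell is background jobs. Note that Start-Job runs each job in a separate process rather than a thread, so this is a sketch of the general idea and not the most lightweight option; the function name Get-CpuFromServers is hypothetical.

```powershell
# Hypothetical helper: collect CPU from many servers concurrently using jobs.
function Get-CpuFromServers {
    param(
        [Parameter(Mandatory)]
        [string[]]$ComputerName
    )

    # Start one background job per server so a slow host does not block the rest.
    $jobs = foreach ($server in $ComputerName) {
        Start-Job -ArgumentList $server -ScriptBlock {
            param($target)
            $sample = Get-Counter -Counter '\Processor(_Total)\% Processor Time' -ComputerName $target
            [pscustomobject]@{
                Server = $target
                Cpu    = $sample.CounterSamples[0].CookedValue
            }
        }
    }

    # Wait for every job, emit their results, then clean up the job objects.
    $jobs | Wait-Job | Receive-Job
    $jobs | Remove-Job
}

# Example call (the second server name is made up):
# Get-CpuFromServers -ComputerName 'Longitude-Demo', 'Another-Server'
```

For a large server list you would normally cap the number of jobs running at once, which this sketch leaves out for brevity.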

 

Conclusion:

When it comes to collecting metrics related to IT operations, the challenge isn’t so much the mechanics of collecting the data itself, but what you do with the data once you have it.

Fundamentally: 

  1. Is the data going to be archived for reporting?
  2. What kind of alerting, thresholds, and escalation are going to be embedded in the script?

If you’re looking to archive the data for reporting, you’ll need to give consideration to the format (e.g., .csv or a SQL database), the volume, and data retention. This will impact not only the method you use to collect the data, but also how often you collect it.
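For the .csv route, a minimal sketch might append one row per reading to an archive file. The function name Add-MetricRecord and the column layout (Timestamp, Server, Metric, Value) are assumptions chosen for illustration, not a prescribed schema.

```powershell
# Hypothetical helper: append one metric reading to a CSV archive.
function Add-MetricRecord {
    param(
        [Parameter(Mandatory)][string]$Path,
        [Parameter(Mandatory)][string]$Server,
        [Parameter(Mandatory)][string]$Metric,
        [Parameter(Mandatory)][double]$Value,
        [datetime]$Timestamp = (Get-Date)
    )

    [pscustomobject]@{
        Timestamp = $Timestamp.ToString('s')   # sortable date/time format
        Server    = $Server
        Metric    = $Metric
        Value     = $Value
    } | Export-Csv -Path $Path -Append -NoTypeInformation
}

# Example call (illustrative path):
# Add-MetricRecord -Path 'C:\PerfLogs\metrics.csv' -Server 'Longitude-Demo' -Metric 'CPU' -Value 39
```

A flat file like this grows with every sample, so the collection interval and the retention period together determine how large the archive gets.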

Alerting is straightforward when there are a limited number of metrics and thresholds. When scripting the alerts, special consideration needs to be given to the alerting criteria. As the number of servers and applications being monitored expands, “one size fits all” thresholds can present challenges. Too many false alerts and you’ll get into a bit of “the boy who cried wolf,” where the alert volume is so high that it is difficult to separate serious problems from the noise. Some upfront planning, in terms of determining not only which metrics to target but also the most appropriate thresholds for alerting, can go a long way towards a successful IT monitoring effort.
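One way to move beyond a single global threshold is a small lookup table with per-server overrides, combined with a rule that only alerts when the breach is sustained. The sketch below is one possible approach; the table contents, the function names, and the "every sample must breach" rule are all assumptions for illustration.

```powershell
# Hypothetical threshold table: a default plus per-server overrides.
$Thresholds = @{
    'Default'        = 85
    'Longitude-Demo' = 95   # example override for a server that normally runs hot
}

# Returns the override for a server if one exists, otherwise the default.
function Get-ThresholdFor {
    param(
        [Parameter(Mandatory)][string]$Server,
        [Parameter(Mandatory)][hashtable]$Table
    )
    if ($Table.ContainsKey($Server)) { return $Table[$Server] }
    return $Table['Default']
}

# Returns $true only when every sample in the window is at or above the threshold,
# which filters out short spikes that would otherwise raise false alerts.
function Test-SustainedBreach {
    param(
        [Parameter(Mandatory)][double[]]$Samples,
        [Parameter(Mandatory)][double]$Threshold
    )
    $belowThreshold = @($Samples | Where-Object { $_ -lt $Threshold })
    return ($belowThreshold.Count -eq 0)
}

# Example call:
# $limit = Get-ThresholdFor -Server 'Longitude-Demo' -Table $Thresholds
# Test-SustainedBreach -Samples 96, 97, 99 -Threshold $limit
```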

